KEC-Elite-Analysts@LT-EDI 2025: Leveraging Deep Learning for Racial Hoax Detection in Code-Mixed Hindi-English Tweets

Malliga Subramanian; Aruna A; Amudhavan M; Jahaganapathi S; Kogilavani Shanmugavadivel

Abstract

Detecting misinformation in code-mixed languages, particularly Hindi-English, poses significant challenges in natural language processing due to the linguistic diversity found on social media. This paper focuses on racial hoax detection—false narratives that target specific communities—within Hindi-English YouTube comments. We evaluate the effectiveness of several machine learning models, including Logistic Regression, Random Forest, Support Vector Machine, Naive Bayes, and Multi-Layer Perceptron, using a dataset of 5,105 annotated comments. Model performance is assessed using accuracy, precision, recall, and F1-score. Experimental results indicate that neural and ensemble models consistently outperform traditional classifiers. Future work will explore the use of transformer-based architectures and data augmentation techniques to enhance detection in low-resource, code-mixed scenarios.
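
As an illustration of the four evaluation metrics named in the abstract, the following minimal sketch computes accuracy, precision, recall, and F1-score for a binary hoax / not-hoax labelling. The function name `binary_metrics`, the label encoding (1 = hoax, 0 = not hoax), and the toy labels are assumptions made for this example; they are not taken from the paper or its dataset, and the authors' own evaluation code may differ (for instance in how scores are averaged across classes).

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for one positive class.

    Hypothetical helper for illustration only; not the authors' code.
    """
    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Guard every ratio against a zero denominator.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (assumed: 1 = hoax, 0 = not hoax); not real data.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Precision and recall are reported alongside accuracy because, on an imbalanced dataset, a classifier can reach high accuracy while missing most examples of the minority class.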

Anthology ID: 2025.ltedi-1.19
Volume: Proceedings of the 5th Conference on Language, Data and Knowledge: Fifth Workshop on Language Technology for Equality, Diversity, Inclusion
Month: September
Year: 2025
Address: Naples, Italy
Editors: Katerina Gkirtzou, Slavko Žitnik, Jorge Gracia, Dagmar Gromann, Maria Pia di Buono, Johanna Monti, Maxim Ionov
Venues: LTEDI | WS
Publisher: Unior Press
Pages: 111–115
URL: https://preview.aclanthology.org/corrections-2025-10/2025.ltedi-1.19/
Cite (ACL): Malliga Subramanian, Aruna A, Amudhavan M, Jahaganapathi S, and Kogilavani Shanmugavadivel. 2025. KEC-Elite-Analysts@LT-EDI 2025: Leveraging Deep Learning for Racial Hoax Detection in Code-Mixed Hindi-English Tweets. In Proceedings of the 5th Conference on Language, Data and Knowledge: Fifth Workshop on Language Technology for Equality, Diversity, Inclusion, pages 111–115, Naples, Italy. Unior Press.
Cite (Informal): KEC-Elite-Analysts@LT-EDI 2025: Leveraging Deep Learning for Racial Hoax Detection in Code-Mixed Hindi-English Tweets (Subramanian et al., LTEDI 2025)
PDF: https://preview.aclanthology.org/corrections-2025-10/2025.ltedi-1.19.pdf
