EM-26@LT-EDI 2025: Detecting Racial Hoaxes in Code-Mixed Social Media Data

Tewodros Achamaleh; Fatima Uroosa; Nida Hafeez; Tolulope Olalekan Abiola; Mikiyas Mebraihtu; Sara Getachew; Grigori Sidorov; Rolando Quintero

Abstract

Social media platforms and user-generated content, such as tweets, comments, and blog posts, often contain offensive language, including racial hate speech, personal attacks, and sexual harassment. Detecting such inappropriate language is essential to ensure user safety and to prevent the spread of hateful behavior and online aggression. Approaches based on conventional machine learning and deep learning have shown robust results for high-resource languages like English but struggle with code-mixed text, which is common in bilingual communication. We participated in the shared task “LT-EDI@LDK 2025” organized by DravidianLangTech, applying the BERT-base multilingual cased model and achieving an F1 score of 0.63. These results indicate that the model can process and interpret the distinctive linguistic features of code-mixed content. The source code is available on GitHub.

Anthology ID: 2025.ltedi-1.25
Volume: Proceedings of the 5th Conference on Language, Data and Knowledge: Fifth Workshop on Language Technology for Equality, Diversity, Inclusion
Month: September
Year: 2025
Address: Naples, Italy
Editors: Katerina Gkirtzou, Slavko Žitnik, Jorge Gracia, Dagmar Gromann, Maria Pia di Buono, Johanna Monti, Maxim Ionov
Venues: LTEDI | WS
Publisher: Unior Press
Pages: 146–152
URL: https://preview.aclanthology.org/add-orcids-2024-emnlp/2025.ltedi-1.25/
Cite (ACL): Tewodros Achamaleh, Fatima Uroosa, Nida Hafeez, Tolulope Olalekan Abiola, Mikiyas Mebraihtu, Sara Getachew, Grigori Sidorov, and Rolando Quintero. 2025. EM-26@LT-EDI 2025: Detecting Racial Hoaxes in Code-Mixed Social Media Data. In Proceedings of the 5th Conference on Language, Data and Knowledge: Fifth Workshop on Language Technology for Equality, Diversity, Inclusion, pages 146–152, Naples, Italy. Unior Press.
Cite (Informal): EM-26@LT-EDI 2025: Detecting Racial Hoaxes in Code-Mixed Social Media Data (Achamaleh et al., LTEDI 2025)
PDF: https://preview.aclanthology.org/add-orcids-2024-emnlp/2025.ltedi-1.25.pdf
