Cognitext@DravidianLangTech2025: Fake News Classification in Malayalam Using mBERT and LSTM

Shriya Alladi, Bharathi B


Abstract
Fake news detection is a crucial task in combating misinformation, particularly in underrepresented languages such as Malayalam. This paper focuses on detecting fake news in Dravidian languages using two tasks: Social Media Text Classification and News Classification. We employ a fine-tuned multilingual BERT (mBERT) model for classifying a given social media text as original or fake, and an LSTM-based architecture for detecting and classifying fake news articles in the Malayalam language into different categories. Extensive preprocessing techniques, such as tokenization and text cleaning, were used to ensure data quality. Our experiments achieved significant accuracy rates and F1-scores. The study’s contributions include applying advanced machine learning techniques to the Malayalam language, addressing the lack of research on low-resource languages, and highlighting the challenges of fake news detection in multilingual and code-mixed environments.
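The abstract mentions text cleaning and tokenization but does not spell out the steps. The following is a minimal sketch of what such a stage might look like for social media text, not the authors' pipeline. The specific rules (dropping URLs and @-mentions, NFC normalization, whitespace tokenization) and the names `clean_text` and `tokenize` are assumptions made for illustration.

```python
import re
import unicodedata
from typing import List

# Hypothetical cleaning rules; the paper's actual rules are not given here.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
WHITESPACE_RE = re.compile(r"\s+")


def clean_text(text: str) -> str:
    """Normalize Unicode, drop URLs and @-mentions, collapse whitespace."""
    # NFC keeps Malayalam characters intact while unifying equivalent forms.
    text = unicodedata.normalize("NFC", text)
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    return WHITESPACE_RE.sub(" ", text).strip()


def tokenize(text: str) -> List[str]:
    """Split cleaned text on whitespace (a stand-in for a real tokenizer)."""
    return clean_text(text).split()
```

In a setup like the one described, the cleaned string would typically be passed to a subword tokenizer for the mBERT model, while the whitespace tokens could feed a vocabulary lookup for an LSTM; both uses are assumptions rather than details confirmed by the abstract.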
Anthology ID:
2025.dravidianlangtech-1.64
Volume:
Proceedings of the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages
Month:
May
Year:
2025
Address:
Acoma, The Albuquerque Convention Center, Albuquerque, New Mexico
Editors:
Bharathi Raja Chakravarthi, Ruba Priyadharshini, Anand Kumar Madasamy, Sajeetha Thavareesan, Elizabeth Sherly, Saranya Rajiakodi, Balasubramanian Palani, Malliga Subramanian, Subalalitha Cn, Dhivya Chinnappa
Venues:
DravidianLangTech | WS
Publisher:
Association for Computational Linguistics
Pages:
361–365
URL:
https://preview.aclanthology.org/landing_page/2025.dravidianlangtech-1.64/
Cite (ACL):
Shriya Alladi and Bharathi B. 2025. Cognitext@DravidianLangTech2025: Fake News Classification in Malayalam Using mBERT and LSTM. In Proceedings of the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages, pages 361–365, Acoma, The Albuquerque Convention Center, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal):
Cognitext@DravidianLangTech2025: Fake News Classification in Malayalam Using mBERT and LSTM (Alladi & B, DravidianLangTech 2025)
PDF:
https://preview.aclanthology.org/landing_page/2025.dravidianlangtech-1.64.pdf