Lemlem Eyob Kawo

2025

pdf bib abs
CIC-NLP@DravidianLangTech 2025: Detecting AI-generated Product Reviews in Dravidian Languages
Tewodros Achamaleh | Tolulope Olalekan Abiola | Lemlem Eyob Kawo | Mikiyas Mebraihtu | Grigori Sidorov
Proceedings of the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages

AI-generated text now matches human writing so well that telling them apart is very difficult. Our CIC-NLP team submits results for the DravidianLangTech@NAACL 2025 shared task to reveal AI-generated product reviews in Dravidian languages. We performed a binary classification task with XLM-RoBERTa-Base using the DravidianLangTech@NAACL 2025 datasets offered by the event organizers. Through training the model correctly, our tests could tell between human and AI-generated reviews with scores of 0.96 for Tamil and 0.88 for Malayalam in the evaluation test set. This paper presents detailed information about preprocessing, model architecture, hyperparameter fine-tuning settings, the experimental process, and the results. The source code is available on GitHub1.

Co-authors

Venues

Fix data