Lemlem Eyob Kawo
2025
CIC-NLP@DravidianLangTech 2025: Detecting AI-generated Product Reviews in Dravidian Languages
Tewodros Achamaleh | Tolulope Olalekan Abiola | Lemlem Eyob Kawo | Mikiyas Mebraihtu | Grigori Sidorov
Proceedings of the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages
AI-generated text now resembles human writing so closely that distinguishing the two is difficult. Our CIC-NLP team participated in the DravidianLangTech@NAACL 2025 shared task on detecting AI-generated product reviews in Dravidian languages. We framed the problem as binary classification and fine-tuned XLM-RoBERTa-Base on the datasets provided by the task organizers. The trained model distinguished human-written from AI-generated reviews with scores of 0.96 for Tamil and 0.88 for Malayalam on the evaluation test set. This paper describes the preprocessing, model architecture, hyperparameter settings, experimental process, and results. The source code is available on GitHub.
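As an illustration of the binary classification setup described in the abstract, the following is a minimal sketch of how XLM-RoBERTa-Base could be configured for human vs. AI review detection with the Hugging Face `transformers` library. It is not the authors' released code. The label names (`HUMAN`, `AI`), the `FineTuneConfig` class, the helper function names, and every hyperparameter value are assumptions made for this example; the abstract does not state them.

```python
from dataclasses import dataclass

# Assumed label names; the shared task's actual label strings may differ.
LABEL2ID = {"HUMAN": 0, "AI": 1}
ID2LABEL = {idx: name for name, idx in LABEL2ID.items()}


@dataclass
class FineTuneConfig:
    """Hypothetical settings; none of these values are taken from the paper."""
    model_name: str = "xlm-roberta-base"  # public Hugging Face checkpoint name
    max_length: int = 128                 # assumed truncation length
    learning_rate: float = 2e-5           # assumed
    epochs: int = 3                       # assumed
    batch_size: int = 16                  # assumed


def encode_labels(labels):
    """Map string labels to integer ids; raises KeyError on unknown labels."""
    return [LABEL2ID[label.strip().upper()] for label in labels]


def build_model_and_tokenizer(cfg):
    """Load the tokenizer and a two-class sequence classification head."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(cfg.model_name)
    model = AutoModelForSequenceClassification.from_pretrained(
        cfg.model_name,
        num_labels=len(LABEL2ID),
        id2label=ID2LABEL,
        label2id=LABEL2ID,
    )
    return model, tokenizer


def predict(texts, model, tokenizer, cfg):
    """Return the predicted label name for each review text."""
    import torch

    batch = tokenizer(
        texts,
        padding=True,
        truncation=True,
        max_length=cfg.max_length,
        return_tensors="pt",
    )
    model.eval()
    with torch.no_grad():
        logits = model(**batch).logits
    return [ID2LABEL[i] for i in logits.argmax(dim=-1).tolist()]
```

In this sketch the same multilingual checkpoint would serve both Tamil and Malayalam, with one model fine-tuned per language dataset. Whether the authors trained separate models per language or a single joint model is not stated in the abstract.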