Livin Nector Dhasan
2025
Necto@DravidianLangTech 2025: Fine-tuning Multilingual MiniLM for Text Classification in Dravidian Languages
Proceedings of the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages
This paper explores the application of a fine-tuned Multilingual MiniLM model to several binary text classification tasks in the Dravidian languages Tamil and Malayalam: detection of AI-generated product reviews, detection of abusive language targeting women, and fake news detection. The work was carried out as part of submissions to shared tasks organized by DravidianLangTech@NAACL 2025. The model was fine-tuned on both the Tamil and Malayalam datasets, and its performance on each task was evaluated using the macro F1-score. The results indicate that the model achieves macro F1 scores very close to the best scores reported by other teams. The AI-generated product review dataset is also investigated, and the findings are reported.
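As an illustration of the evaluation metric named in the abstract, the following is a minimal sketch of how a macro F1-score can be computed for a binary task: the F1 of each class is computed separately and the per-class values are averaged with equal weight. The function names and the toy labels are assumptions made for this example; they are not taken from the paper or from the shared-task evaluation scripts.

```python
from typing import Hashable, Sequence


def f1_for_class(y_true: Sequence[Hashable], y_pred: Sequence[Hashable], label: Hashable) -> float:
    """F1 of a single class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    # Convention assumed here: F1 is 0.0 when the class never occurs or is never predicted.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = set(y_true) | set(y_pred)
    return sum(f1_for_class(y_true, y_pred, lab) for lab in labels) / len(labels)


# Hypothetical toy labels (1 = positive class, 0 = negative class); not real task data.
gold = [1, 1, 1, 0]
pred = [1, 1, 0, 0]
score = macro_f1(gold, pred)
print(f"macro F1 = {score:.4f}")
```

Because each class contributes equally to the average, macro F1 penalizes a model that performs well only on the majority class, which is one reason it is commonly chosen for classification tasks with imbalanced labels.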