Optimize_Prime@DravidianLangTech-ACL2022: Abusive Comment Detection in Tamil
Shantanu Patankar, Omkar Gokhale, Onkar Litake, Aditya Mandke, Dipali Kadam
Abstract
This paper tries to address the problem of abusive comment detection in low-resource indic languages. Abusive comments are statements that are offensive to a person or a group of people. These comments are targeted toward individuals belonging to specific ethnicities, genders, caste, race, sexuality, etc. Abusive Comment Detection is a significant problem, especially with the recent rise in social media users. This paper presents the approach used by our team — Optimize_Prime, in the ACL 2022 shared task “Abusive Comment Detection in Tamil.” This task detects and classifies YouTube comments in Tamil and Tamil-English Codemixed format into multiple categories. We have used three methods to optimize our results: Ensemble models, Recurrent Neural Networks, and Transformers. In the Tamil data, MuRIL and XLM-RoBERTA were our best performing models with a macro-averaged f1 score of 0.43. Furthermore, for the Code-mixed data, MuRIL and M-BERT provided sublime results, with a macro-averaged f1 score of 0.45.- Anthology ID:
- 2022.dravidianlangtech-1.36
- Volume:
- Proceedings of the Second Workshop on Speech and Language Technologies for Dravidian Languages
- Month:
- May
- Year:
- 2022
- Address:
- Dublin, Ireland
- Editors:
- Bharathi Raja Chakravarthi, Ruba Priyadharshini, Anand Kumar Madasamy, Parameswari Krishnamurthy, Elizabeth Sherly, Sinnathamby Mahesan
- Venue:
- DravidianLangTech
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 235–239
- Language:
- URL:
- https://aclanthology.org/2022.dravidianlangtech-1.36
- DOI:
- 10.18653/v1/2022.dravidianlangtech-1.36
- Cite (ACL):
- Shantanu Patankar, Omkar Gokhale, Onkar Litake, Aditya Mandke, and Dipali Kadam. 2022. Optimize_Prime@DravidianLangTech-ACL2022: Abusive Comment Detection in Tamil. In Proceedings of the Second Workshop on Speech and Language Technologies for Dravidian Languages, pages 235–239, Dublin, Ireland. Association for Computational Linguistics.
- Cite (Informal):
- Optimize_Prime@DravidianLangTech-ACL2022: Abusive Comment Detection in Tamil (Patankar et al., DravidianLangTech 2022)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-5/2022.dravidianlangtech-1.36.pdf