DE-ABUSE@TamilNLP-ACL 2022: Transliteration as Data Augmentation for Abuse Detection in Tamil
Vasanth Palanikumar, Sean Benhur, Adeep Hande, Bharathi Raja Chakravarthi
Abstract
With the rise of social media and the internet, there is a necessity to provide an inclusive space and prevent abusive topics against any gender, race or community. This paper describes the system submitted to the ACL-2022 shared task on fine-grained abuse detection in Tamil. In our approach, we transliterated the code-mixed dataset as an augmentation technique to increase the size of the data. Using this method, we were able to rank 3rd on the task with a 0.290 macro average F1 score and a 0.590 weighted F1 score.
- Anthology ID:
- 2022.dravidianlangtech-1.5
- Volume:
- Proceedings of the Second Workshop on Speech and Language Technologies for Dravidian Languages
- Month:
- May
- Year:
- 2022
- Address:
- Dublin, Ireland
- Venue:
- DravidianLangTech
- Publisher:
- Association for Computational Linguistics
- Pages:
- 33–38
- URL:
- https://aclanthology.org/2022.dravidianlangtech-1.5
- DOI:
- 10.18653/v1/2022.dravidianlangtech-1.5
- Cite (ACL):
- Vasanth Palanikumar, Sean Benhur, Adeep Hande, and Bharathi Raja Chakravarthi. 2022. DE-ABUSE@TamilNLP-ACL 2022: Transliteration as Data Augmentation for Abuse Detection in Tamil. In Proceedings of the Second Workshop on Speech and Language Technologies for Dravidian Languages, pages 33–38, Dublin, Ireland. Association for Computational Linguistics.
- Cite (Informal):
- DE-ABUSE@TamilNLP-ACL 2022: Transliteration as Data Augmentation for Abuse Detection in Tamil (Palanikumar et al., DravidianLangTech 2022)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/2022.dravidianlangtech-1.5.pdf
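
The abstract reports two different averages of the per-class F1 score (0.290 macro, 0.590 weighted). The sketch below is not the authors' evaluation code; it is a minimal pure-Python illustration of how the two averages are defined, and why they can diverge on an imbalanced label set. The label names (`none`, `abusive`) and the toy predictions are hypothetical and are not taken from the shared-task data.

```python
def f1_per_class(y_true, y_pred):
    """Return {label: (f1, support)} for every label that occurs in y_true."""
    scores = {}
    for label in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit.
        f1 = 2 * tp / denom if denom else 0.0
        scores[label] = (f1, tp + fn)  # support = number of true instances
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = f1_per_class(y_true, y_pred)
    return sum(f1 for f1, _ in scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by class support."""
    scores = f1_per_class(y_true, y_pred)
    total = sum(support for _, support in scores.values())
    return sum(f1 * support for f1, support in scores.values()) / total


# Hypothetical toy data: 8 majority-class items, 2 minority-class items,
# and a classifier that always predicts the majority class.
y_true = ["none"] * 8 + ["abusive"] * 2
y_pred = ["none"] * 10

macro = macro_f1(y_true, y_pred)
weighted = weighted_f1(y_true, y_pred)
print(f"macro F1 = {macro:.3f}, weighted F1 = {weighted:.3f}")
```

In this toy case the minority class gets an F1 of 0, which pulls the macro average down much more than the weighted average. A gap of that kind between the two scores is typical of imbalanced, fine-grained label sets.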