NITK-IT_NLP@TamilNLP-ACL2022: Transformer based model for Toxic Span Identification in Tamil
Hariharan LekshmiAmmal, Manikandan Ravikiran, Anand Kumar Madasamy
Abstract
Toxic span identification in Tamil is a shared task that focuses on identifying the harmful content that contributes to offensiveness. In this work, we have built a model that can efficiently identify the span of text contributing to offensive content. We used various transformer-based models to develop the system, of which the fine-tuned MuRIL model achieved the best overall character F1-score of 0.4489.
- Anthology ID:
- 2022.dravidianlangtech-1.12
- Volume:
- Proceedings of the Second Workshop on Speech and Language Technologies for Dravidian Languages
- Month:
- May
- Year:
- 2022
- Address:
- Dublin, Ireland
- Venue:
- DravidianLangTech
- Publisher:
- Association for Computational Linguistics
- Pages:
- 75–78
- URL:
- https://aclanthology.org/2022.dravidianlangtech-1.12
- DOI:
- 10.18653/v1/2022.dravidianlangtech-1.12
- Cite (ACL):
- Hariharan LekshmiAmmal, Manikandan Ravikiran, and Anand Kumar Madasamy. 2022. NITK-IT_NLP@TamilNLP-ACL2022: Transformer based model for Toxic Span Identification in Tamil. In Proceedings of the Second Workshop on Speech and Language Technologies for Dravidian Languages, pages 75–78, Dublin, Ireland. Association for Computational Linguistics.
- Cite (Informal):
- NITK-IT_NLP@TamilNLP-ACL2022: Transformer based model for Toxic Span Identification in Tamil (LekshmiAmmal et al., DravidianLangTech 2022)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/2022.dravidianlangtech-1.12.pdf
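The abstract reports a character F1-score but does not define it. The sketch below shows one common way such a score is computed in toxic span tasks, assuming each prediction and each gold annotation is a collection of character offsets and that per-comment scores are averaged. The function names `char_f1` and `mean_char_f1`, and the handling of empty annotations, are illustrative assumptions and not necessarily the shared task's official scorer.

```python
from typing import Iterable, Sequence


def char_f1(pred: Iterable[int], gold: Iterable[int]) -> float:
    """F1 between two sets of character offsets for a single comment.

    Assumption: an empty prediction against an empty gold set scores 1.0,
    and exactly one empty side scores 0.0.
    """
    pred_set, gold_set = set(pred), set(gold)
    if not pred_set and not gold_set:
        return 1.0
    if not pred_set or not gold_set:
        return 0.0
    overlap = len(pred_set & gold_set)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_set)
    recall = overlap / len(gold_set)
    return 2 * precision * recall / (precision + recall)


def mean_char_f1(preds: Sequence[Iterable[int]],
                 golds: Sequence[Iterable[int]]) -> float:
    """Average the per-comment character F1 over a dataset."""
    if len(preds) != len(golds):
        raise ValueError("preds and golds must have the same length")
    if not preds:
        raise ValueError("at least one comment is required")
    scores = [char_f1(p, g) for p, g in zip(preds, golds)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical offsets: the prediction covers characters 0-3,
    # the gold annotation covers characters 2-5.
    print(char_f1([0, 1, 2, 3], [2, 3, 4, 5]))
```

Under this definition a prediction is rewarded for every offensive character it covers and penalised for every extra character it marks, so partially overlapping spans receive partial credit.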