Hariharan LekshmiAmmal


2022

NITK-IT_NLP@TamilNLP-ACL2022: Transformer based model for Toxic Span Identification in Tamil
Hariharan LekshmiAmmal | Manikandan Ravikiran | Anand Kumar Madasamy
Proceedings of the Second Workshop on Speech and Language Technologies for Dravidian Languages

Toxic span identification in Tamil is a shared task that focuses on identifying the spans of text that make content offensive. In this work, we built a model that identifies the span of text contributing to offensive content. We experimented with several transformer-based models, of which the fine-tuned MuRIL model achieved the best overall character F1-score of 0.4489.
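
The character F1-score mentioned in the abstract can be illustrated with a minimal sketch. It assumes the metric is the character-offset F1 commonly used for toxic span tasks, where each post is scored by comparing the set of predicted toxic character offsets with the gold set, and the scores are averaged over posts. The function names `char_f1` and `mean_char_f1` are hypothetical, and the handling of empty spans is an assumption rather than something stated in the abstract.

```python
from typing import Iterable, List, Tuple


def char_f1(predicted: Iterable[int], gold: Iterable[int]) -> float:
    """F1 between predicted and gold toxic character offsets for one post."""
    pred_set, gold_set = set(predicted), set(gold)

    # Assumed convention: a post with no gold toxic characters scores 1.0
    # only when the system also predicts nothing.
    if not gold_set:
        return 1.0 if not pred_set else 0.0
    if not pred_set:
        return 0.0

    overlap = len(pred_set & gold_set)
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_set)
    recall = overlap / len(gold_set)
    return 2 * precision * recall / (precision + recall)


def mean_char_f1(pairs: List[Tuple[Iterable[int], Iterable[int]]]) -> float:
    """Average the per-post character F1 over (predicted, gold) pairs."""
    if not pairs:
        return 0.0
    return sum(char_f1(pred, gold) for pred, gold in pairs) / len(pairs)


if __name__ == "__main__":
    # Predicted offsets 0-3, gold offsets 2-5: two offsets overlap.
    print(char_f1([0, 1, 2, 3], [2, 3, 4, 5]))  # 0.5
```

In this example, precision and recall are both 2/4, so the per-post score is 0.5. A system-level figure such as the one reported would, under the same assumption, be the mean of these per-post scores over the test set.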