Abstract
In this paper we present our submission for the EACL 2021 Shared Task on Offensive Language Identification in Dravidian languages. Our final system is an ensemble of mBERT and XLM-RoBERTa models that leverage task-adaptive pre-training of multilingual BERT models with a masked language modeling objective. Our system was ranked 1st for Kannada, 2nd for Malayalam, and 3rd for Tamil.
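The task-adaptive pre-training step mentioned in the abstract can be sketched roughly as follows: continue masked language modeling on the unlabeled task text starting from a pretrained multilingual checkpoint, then fine-tune the adapted model for classification. The snippet below is a minimal illustration using Hugging Face Transformers; the model name, hyperparameters, and placeholder data are assumptions for illustration, not the authors' exact setup (see the linked repository for their code).

```python
# Hypothetical sketch of task-adaptive pre-training (TAPT) via masked language modeling.
# Model name, hyperparameters, and the placeholder corpus are illustrative assumptions.
from transformers import (
    AutoTokenizer,
    AutoModelForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from datasets import Dataset

model_name = "xlm-roberta-base"  # or "bert-base-multilingual-cased" for mBERT
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# Unlabeled comments from the task corpus (placeholder examples).
texts = ["example code-mixed comment one", "example code-mixed comment two"]
dataset = Dataset.from_dict({"text": texts}).map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True,
    remove_columns=["text"],
)

# Randomly mask 15% of tokens for the MLM objective.
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)

args = TrainingArguments(
    output_dir="tapt-xlmr",
    num_train_epochs=3,
    per_device_train_batch_size=8,
)
Trainer(
    model=model, args=args, train_dataset=dataset, data_collator=collator
).train()

# The adapted checkpoint is later loaded for sequence-classification fine-tuning.
model.save_pretrained("tapt-xlmr")
tokenizer.save_pretrained("tapt-xlmr")
```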
- Anthology ID: 2021.dravidianlangtech-1.44
- Volume: Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages
- Month: April
- Year: 2021
- Address: Kyiv
- Venue: DravidianLangTech
- Publisher: Association for Computational Linguistics
- Pages: 307–312
- URL: https://aclanthology.org/2021.dravidianlangtech-1.44
- Cite (ACL): Sai Muralidhar Jayanthi and Akshat Gupta. 2021. SJ_AJ@DravidianLangTech-EACL2021: Task-Adaptive Pre-Training of Multilingual BERT models for Offensive Language Identification. In Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages, pages 307–312, Kyiv. Association for Computational Linguistics.
- Cite (Informal): SJ_AJ@DravidianLangTech-EACL2021: Task-Adaptive Pre-Training of Multilingual BERT models for Offensive Language Identification (Jayanthi & Gupta, DravidianLangTech 2021)
- PDF: https://preview.aclanthology.org/paclic-22-ingestion/2021.dravidianlangtech-1.44.pdf
- Code: murali1996/eacl2021-OffensEval-Dravidian