Abstract
Sentiment analysis in code-mixed languages has garnered much attention in recent years. It is an important task for social media monitoring and has many applications, as a large chunk of social media data is code-mixed. In this paper, we work on the problem of sentiment analysis for the Dravidian code-switched languages Tamil-English and Malayalam-English, using three different BERT-based models. We leverage task-specific pre-training and cross-lingual transfer to improve on previously reported results, with a significant improvement on the Tamil-English dataset. We also present a multilingual sentiment classification model with competitive performance on both the Tamil-English and Malayalam-English datasets.
- Anthology ID: 2021.dravidianlangtech-1.9
- Volume: Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages
- Month: April
- Year: 2021
- Address: Kyiv
- Editors: Bharathi Raja Chakravarthi, Ruba Priyadharshini, Anand Kumar M, Parameswari Krishnamurthy, Elizabeth Sherly
- Venue: DravidianLangTech
- Publisher: Association for Computational Linguistics
- Pages: 73–79
- URL: https://aclanthology.org/2021.dravidianlangtech-1.9
- Cite (ACL): Akshat Gupta, Sai Krishna Rallabandi, and Alan W Black. 2021. Task-Specific Pre-Training and Cross Lingual Transfer for Sentiment Analysis in Dravidian Code-Switched Languages. In Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages, pages 73–79, Kyiv. Association for Computational Linguistics.
- Cite (Informal): Task-Specific Pre-Training and Cross Lingual Transfer for Sentiment Analysis in Dravidian Code-Switched Languages (Gupta et al., DravidianLangTech 2021)
- PDF: https://preview.aclanthology.org/revert-3132-ingestion-checklist/2021.dravidianlangtech-1.9.pdf
- Data: SentiMix, TweetEval