Abstract
This paper describes our system for a shared task on code-mixed, less-resourced sentiment analysis for Indo-Aryan languages. We use large language models (LLMs) because they have demonstrated excellent performance on classification tasks. Across all tracks, we use the unsloth/mistral-7b-bnb-4bit LLM for code-mixed sentiment analysis. For track 1, we used a simple fine-tuning strategy on pre-trained language models (PLMs), combining data from multiple phases. Our trained systems secured first place in four phases out of five. In addition, we present the results achieved using several PLMs for each language.
- Anthology ID:
- 2024.wildre-1.9
- Volume:
- Proceedings of the 7th Workshop on Indian Language Data: Resources and Evaluation
- Month:
- May
- Year:
- 2024
- Address:
- Torino, Italia
- Editors:
- Girish Nath Jha, Sobha L., Kalika Bali, Atul Kr. Ojha
- Venues:
- WILDRE | WS
- Publisher:
- ELRA and ICCL
- Pages:
- 59–65
- URL:
- https://aclanthology.org/2024.wildre-1.9
- Cite (ACL):
- Gaurish Thakkar, Marko Tadić, and Nives Mikelic Preradovic. 2024. FZZG at WILDRE-7: Fine-tuning Pre-trained Models for Code-mixed, Less-resourced Sentiment Analysis. In Proceedings of the 7th Workshop on Indian Language Data: Resources and Evaluation, pages 59–65, Torino, Italia. ELRA and ICCL.
- Cite (Informal):
- FZZG at WILDRE-7: Fine-tuning Pre-trained Models for Code-mixed, Less-resourced Sentiment Analysis (Thakkar et al., WILDRE-WS 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-5/2024.wildre-1.9.pdf