Abstract
In this study, we address the task of Sentiment Analysis for Bangla Social Media Posts, introduced in the First Workshop on Bangla Language Processing (CITATION). Our research encountered two significant challenges. First, oversampling to address class imbalance led to long training times and memory constraints. Conversely, undersampling kept training time short but resulted in poor model performance. These challenges highlight the trade-offs involved in selecting sampling methods to address class imbalance in sentiment analysis tasks. We tackle them with cost-sensitive approaches aimed at enhancing model performance. In our initial submission during the evaluation phase, we ranked 9th out of 30 participants with an F1-micro score of 0.7088. Subsequently, through additional experimentation, we raised our F1-micro score to 0.7186 by using the BanglaBERT-Large model in combination with the Self-adjusting Dice loss function. Our experiments highlight the effect that modifying the loss function has on model performance. Our experimental data and source code can be found here.
- Anthology ID:
- 2023.banglalp-1.46
- Volume:
- Proceedings of the First Workshop on Bangla Language Processing (BLP-2023)
- Month:
- December
- Year:
- 2023
- Address:
- Singapore
- Editors:
- Firoj Alam, Sudipta Kar, Shammur Absar Chowdhury, Farig Sadeque, Ruhul Amin
- Venue:
- BanglaLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 340–346
- URL:
- https://aclanthology.org/2023.banglalp-1.46
- DOI:
- 10.18653/v1/2023.banglalp-1.46
- Cite (ACL):
- S.m Towhidul Islam Tonmoy. 2023. Embeddings at BLP-2023 Task 2: Optimizing Fine-Tuned Transformers with Cost-Sensitive Learning for Multiclass Sentiment Analysis. In Proceedings of the First Workshop on Bangla Language Processing (BLP-2023), pages 340–346, Singapore. Association for Computational Linguistics.
- Cite (Informal):
- Embeddings at BLP-2023 Task 2: Optimizing Fine-Tuned Transformers with Cost-Sensitive Learning for Multiclass Sentiment Analysis (Tonmoy, BanglaLP 2023)
- PDF:
- https://preview.aclanthology.org/dois-2013-emnlp/2023.banglalp-1.46.pdf
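
The abstract names the Self-adjusting Dice loss but does not state its formula. The sketch below shows one commonly used formulation applied to the probability of the true class; the function names, the default values of `alpha` and `gamma`, and the choice to average over a batch are assumptions made for illustration, not details taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def self_adjusting_dice_loss(logits, target, alpha=1.0, gamma=1.0):
    """Per-example loss computed from the true-class probability.

    Assumed formulation (not stated in the abstract):
        w    = (1 - p) ** alpha * p
        loss = 1 - (2 * w + gamma) / (w + 1 + gamma)
    where p is the softmax probability of the true class, alpha is the
    exponent of the modulating factor, and gamma is a smoothing term.
    """
    p = softmax(logits)[target]
    w = ((1.0 - p) ** alpha) * p
    return 1.0 - (2.0 * w + gamma) / (w + 1.0 + gamma)


def batch_loss(batch_logits, targets, alpha=1.0, gamma=1.0):
    """Mean of the per-example losses over a batch (an illustrative choice)."""
    losses = [
        self_adjusting_dice_loss(lg, t, alpha, gamma)
        for lg, t in zip(batch_logits, targets)
    ]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Three hypothetical classes, e.g. negative / neutral / positive.
    logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 0.3]]
    targets = [0, 2]
    print(batch_loss(logits, targets))
```

In a fine-tuning setup, a function of this shape would replace the default cross-entropy loss; the sampling of the training data stays unchanged, which is the point of a cost-sensitive approach as opposed to oversampling or undersampling.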