MUCS@DravidianLangTech@ACL2022: Ensemble of Logistic Regression Penalties to Identify Emotions in Tamil Text

Asha Hegde, Sharal Coelho, Hosahalli Shashirekha


Abstract
Emotion Analysis (EA) is the process of automatically analyzing input text and categorizing it into one of a predefined set of emotions. In recent years, people have turned to social media to express their emotions, opinions, or feelings about news, movies, products, services, and so on. These users' emotions may help the public, governments, business organizations, film producers, and others in devising strategies and making decisions. The increasing number of social media users and the increasing amount of user-generated text containing emotions on social media demand automated tools for the analysis of such data, as handling this data manually is labor-intensive and error-prone. Further, the characteristics of social media data make EA challenging. Most EA research has focused on English, leaving several Indian languages, including Tamil, unexplored for this task. To address the challenges of EA in Tamil texts, in this paper we, team MUCS, describe the model submitted to the shared task on Emotion Analysis in Tamil at DravidianLangTech@ACL 2022. Of the two subtasks in this shared task, our team submitted a model only for Task a. The proposed model comprises an ensemble of Logistic Regression (LR) classifiers with three penalties, namely L1, L2, and Elasticnet. This ensemble model, trained with Term Frequency-Inverse Document Frequency (TF-IDF) of character bigrams and trigrams, secured 4th rank in Task a with a macro averaged F1-score of 0.04. The code to reproduce the proposed models is available on GitHub.
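The model described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn, hard (majority) voting as the way the three classifiers are combined, default regularization strength, an `l1_ratio` of 0.5 for the Elasticnet penalty, and a tiny English toy dataset standing in for the Tamil shared-task data. None of these details are stated in the abstract, and the helper name `build_ensemble` is hypothetical.

```python
from sklearn.ensemble import VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


def build_ensemble():
    """Hypothetical helper: TF-IDF char n-grams feeding three LR classifiers."""
    # TF-IDF over character bigrams and trigrams, as named in the abstract.
    vectorizer = TfidfVectorizer(analyzer="char", ngram_range=(2, 3))

    # The "saga" solver supports the l1, l2, and elasticnet penalties.
    # max_iter, random_state, and l1_ratio are assumptions for this sketch.
    common = dict(solver="saga", max_iter=2000, random_state=0)
    estimators = [
        ("lr_l1", LogisticRegression(penalty="l1", **common)),
        ("lr_l2", LogisticRegression(penalty="l2", **common)),
        ("lr_en", LogisticRegression(penalty="elasticnet", l1_ratio=0.5, **common)),
    ]

    # Assumption: majority vote across the three classifiers.
    ensemble = VotingClassifier(estimators=estimators, voting="hard")
    return make_pipeline(vectorizer, ensemble)


# Toy placeholder data (not from the shared task).
train_texts = [
    "so happy today",
    "what a joyful day",
    "i am very sad",
    "this is so sad and gloomy",
]
train_labels = ["joy", "joy", "sadness", "sadness"]

model = build_ensemble()
model.fit(train_texts, train_labels)
predictions = model.predict(["such a happy day", "a sad gloomy evening"])
print(list(predictions))
```

Character n-grams avoid the need for a language-specific tokenizer, which is one plausible reason to prefer them for noisy social media text. The voting scheme and hyperparameters above would need to be checked against the paper and its released code.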
Anthology ID:
2022.dravidianlangtech-1.23
Volume:
Proceedings of the Second Workshop on Speech and Language Technologies for Dravidian Languages
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Bharathi Raja Chakravarthi, Ruba Priyadharshini, Anand Kumar Madasamy, Parameswari Krishnamurthy, Elizabeth Sherly, Sinnathamby Mahesan
Venue:
DravidianLangTech
Publisher:
Association for Computational Linguistics
Pages:
145–150
URL:
https://aclanthology.org/2022.dravidianlangtech-1.23
DOI:
10.18653/v1/2022.dravidianlangtech-1.23
Cite (ACL):
Asha Hegde, Sharal Coelho, and Hosahalli Shashirekha. 2022. MUCS@DravidianLangTech@ACL2022: Ensemble of Logistic Regression Penalties to Identify Emotions in Tamil Text. In Proceedings of the Second Workshop on Speech and Language Technologies for Dravidian Languages, pages 145–150, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
MUCS@DravidianLangTech@ACL2022: Ensemble of Logistic Regression Penalties to Identify Emotions in Tamil Text (Hegde et al., DravidianLangTech 2022)
PDF:
https://preview.aclanthology.org/nschneid-patch-1/2022.dravidianlangtech-1.23.pdf
Video:
https://preview.aclanthology.org/nschneid-patch-1/2022.dravidianlangtech-1.23.mp4