Cross-lingual Emotion Detection

Sabit Hassan, Shaden Shaar, Kareem Darwish


Abstract
Emotion detection can provide us with a window into understanding human behavior. Due to the complex dynamics of human emotions, however, constructing annotated datasets to train automated models can be expensive. Thus, we explore the efficacy of cross-lingual approaches that would use data from a source language to build models for emotion detection in a target language. We compare three approaches, namely: i) using inherently multilingual models; ii) translating training data into the target language; and iii) using an automatically tagged parallel corpus. In our study, we consider English as the source language with Arabic and Spanish as target languages. We study the effectiveness of different classification models such as BERT and SVMs trained with different features. Our BERT-based monolingual models that are trained on target language data surpass state-of-the-art (SOTA) by 4% and 5% absolute Jaccard score for Arabic and Spanish respectively. Next, we show that using cross-lingual approaches with English data alone, we can achieve more than 90% and 80% relative effectiveness of the Arabic and Spanish BERT models respectively. Lastly, we use LIME to analyze the challenges of training cross-lingual models for different language pairs.
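The abstract reports results as a Jaccard score and as "relative effectiveness" against a monolingual model, without defining either here. The sketch below shows one common reading, offered as an assumption rather than the authors' evaluation code: a sample-averaged multi-label Jaccard score (intersection over union of gold and predicted emotion sets), and relative effectiveness as the ratio of the cross-lingual score to the monolingual score. The function names, the handling of samples with no emotions, and the toy labels are all hypothetical.

```python
from typing import List, Set


def jaccard_score(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Sample-averaged multi-label Jaccard: mean of |G & P| / |G | P|."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same number of samples")
    if not gold:
        raise ValueError("at least one sample is required")
    total = 0.0
    for g, p in zip(gold, pred):
        union = g | p
        # Assumption: a sample with no gold and no predicted emotions
        # counts as a perfect match (score 1.0).
        total += 1.0 if not union else len(g & p) / len(union)
    return total / len(gold)


def relative_effectiveness(cross_lingual: float, monolingual: float) -> float:
    """Assumed definition: cross-lingual score as a fraction of the monolingual score."""
    if monolingual <= 0:
        raise ValueError("monolingual score must be positive")
    return cross_lingual / monolingual


if __name__ == "__main__":
    # Toy labels for illustration only; not data from the paper.
    gold = [{"joy", "love"}, {"anger"}, set()]
    pred = [{"joy"}, {"anger", "fear"}, set()]
    print(f"Jaccard: {jaccard_score(gold, pred):.3f}")
    print(f"Relative effectiveness: {relative_effectiveness(0.45, 0.50):.2f}")
```

Under this reading, "more than 90% relative effectiveness" would mean the cross-lingual model's Jaccard score is at least 0.9 times that of the target-language BERT model.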
Anthology ID:
2022.lrec-1.751
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
6948–6958
URL:
https://aclanthology.org/2022.lrec-1.751
Cite (ACL):
Sabit Hassan, Shaden Shaar, and Kareem Darwish. 2022. Cross-lingual Emotion Detection. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 6948–6958, Marseille, France. European Language Resources Association.
Cite (Informal):
Cross-lingual Emotion Detection (Hassan et al., LREC 2022)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2022.lrec-1.751.pdf