RED v2: Enhancing RED Dataset for Multi-Label Emotion Detection
Alexandra Ciobotaru, Mihai Vlad Constantinescu, Liviu P. Dinu, Stefan Dumitrescu
Abstract
RED (Romanian Emotion Dataset) is a machine learning-based resource developed for the automatic detection of emotions in Romanian texts, containing single-label annotated tweets with one of the following emotions: joy, fear, sadness, anger and neutral. In this work, we propose REDv2, an open-source extension of RED that adds two more emotions, trust and surprise, and widens the annotation schema so that the resulting dataset is multi-label. We show the overall reliability of our dataset by computing inter-annotator agreements per tweet using a formula suitable for our annotation setup, and we aggregate all annotators’ opinions into two variants of ground truth, one suitable for multi-label classification and the other suitable for text regression. We propose strong baselines with two transformer models, the Romanian BERT and the multilingual XLM-RoBERTa model, in both categorical and regression settings.
- Anthology ID:
- 2022.lrec-1.149
- Volume:
- Proceedings of the Thirteenth Language Resources and Evaluation Conference
- Month:
- June
- Year:
- 2022
- Address:
- Marseille, France
- Editors:
- Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association
- Pages:
- 1392–1399
- URL:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2022.lrec-1.149/
- Cite (ACL):
- Alexandra Ciobotaru, Mihai Vlad Constantinescu, Liviu P. Dinu, and Stefan Dumitrescu. 2022. RED v2: Enhancing RED Dataset for Multi-Label Emotion Detection. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 1392–1399, Marseille, France. European Language Resources Association.
- Cite (Informal):
- RED v2: Enhancing RED Dataset for Multi-Label Emotion Detection (Ciobotaru et al., LREC 2022)
- PDF:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2022.lrec-1.149.pdf