Abstract
In Task 9, we are required to analyze the textual intimacy of tweets in 10 languages. We fine-tune XLM-RoBERTa (XLM-R) pre-trained model to adapt to this multilingual regression task. After tentative experiments, severe class imbalance is observed in the official released dataset, which may compromise the convergence and weaken the model effect. To tackle such challenge, we take measures in two aspects. On the one hand, we implement data augmentation through machine translation to enlarge the scale of classes with fewer samples. On the other hand, we introduce focal mean square error (MSE) loss to emphasize the contributions of hard samples to total loss, thus further mitigating the impact of class imbalance on model effect. Extensive experiments demonstrate remarkable effectiveness of our strategies, and our model achieves high performance on the Pearson’s correlation coefficient (CC) almost above 0.85 on validation dataset.- Anthology ID:
- 2023.semeval-1.9
- Volume:
- Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
- Month:
- July
- Year:
- 2023
- Address:
- Toronto, Canada
- Editors:
- Atul Kr. Ojha, A. Seza Doğruöz, Giovanni Da San Martino, Harish Tayyar Madabushi, Ritesh Kumar, Elisa Sartori
- Venue:
- SemEval
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 77–82
- Language:
- URL:
- https://aclanthology.org/2023.semeval-1.9
- DOI:
- 10.18653/v1/2023.semeval-1.9
- Cite (ACL):
- Yuxi Chen, Yu Chang, Yanqing Tao, and Yanru Zhang. 2023. Sea_and_Wine at SemEval-2023 Task 9: A Regression Model with Data Augmentation for Multilingual Intimacy Analysis. In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), pages 77–82, Toronto, Canada. Association for Computational Linguistics.
- Cite (Informal):
- Sea_and_Wine at SemEval-2023 Task 9: A Regression Model with Data Augmentation for Multilingual Intimacy Analysis (Chen et al., SemEval 2023)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-1/2023.semeval-1.9.pdf