Abstract
The present study describes our submission to SemEval 2018 Task 1: Affect in Tweets. Our Spanish-only approach aimed to demonstrate that it is beneficial to automatically generate additional training data by (i) translating training data from other languages and (ii) applying a semi-supervised learning method. We find strong support for both approaches, with those models outperforming our regular models in all subtasks. However, creating a stepwise ensemble of different models as opposed to simply averaging did not result in an increase in performance. We placed second (EI-Reg), second (EI-Oc), fourth (V-Reg) and fifth (V-Oc) in the four Spanish subtasks we participated in.
- Anthology ID:
- S18-1041
- Volume:
- Proceedings of the 12th International Workshop on Semantic Evaluation
- Month:
- June
- Year:
- 2018
- Address:
- New Orleans, Louisiana
- Venues:
- SemEval | *SEM
- SIGs:
- SIGLEX | SIGSEM
- Publisher:
- Association for Computational Linguistics
- Pages:
- 279–285
- URL:
- https://aclanthology.org/S18-1041
- DOI:
- 10.18653/v1/S18-1041
- Cite (ACL):
- Marloes Kuijper, Mike van Lenthe, and Rik van Noord. 2018. UG18 at SemEval-2018 Task 1: Generating Additional Training Data for Predicting Emotion Intensity in Spanish. In Proceedings of the 12th International Workshop on Semantic Evaluation, pages 279–285, New Orleans, Louisiana. Association for Computational Linguistics.
- Cite (Informal):
- UG18 at SemEval-2018 Task 1: Generating Additional Training Data for Predicting Emotion Intensity in Spanish (Kuijper et al., SemEval-*SEM 2018)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/S18-1041.pdf