Abstract
This paper describes our approach to SemEval-2018 Task 2, which aims to predict the most likely associated emoji, given a tweet in English or Spanish. We normalized text-based tweets during pre-processing, following which we utilized a bi-directional gated recurrent unit with an attention mechanism to build our base model. Multi-models with or without class weights were trained for the ensemble methods. We boosted models without class weights, and only strong boost classifiers were identified. In our system, not only was a boosting method used, but we also took advantage of the voting ensemble method to enhance our final system result. Our method demonstrated an obvious improvement of approximately 3% of the macro F1 score in English and 2% in Spanish.
- Anthology ID:
- S18-1073
- Volume:
- Proceedings of the 12th International Workshop on Semantic Evaluation
- Month:
- June
- Year:
- 2018
- Address:
- New Orleans, Louisiana
- Editors:
- Marianna Apidianaki, Saif M. Mohammad, Jonathan May, Ekaterina Shutova, Steven Bethard, Marine Carpuat
- Venue:
- SemEval
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 459–465
- URL:
- https://aclanthology.org/S18-1073
- DOI:
- 10.18653/v1/S18-1073
- Cite (ACL):
- Nan Wang, Jin Wang, and Xuejie Zhang. 2018. YNU-HPCC at SemEval-2018 Task 2: Multi-ensemble Bi-GRU Model with Attention Mechanism for Multilingual Emoji Prediction. In Proceedings of the 12th International Workshop on Semantic Evaluation, pages 459–465, New Orleans, Louisiana. Association for Computational Linguistics.
- Cite (Informal):
- YNU-HPCC at SemEval-2018 Task 2: Multi-ensemble Bi-GRU Model with Attention Mechanism for Multilingual Emoji Prediction (Wang et al., SemEval 2018)
- PDF:
- https://preview.aclanthology.org/ingest-bitext-workshop/S18-1073.pdf
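
The abstract mentions a voting ensemble and reports results in macro F1. As a rough illustration of those two ideas only, the sketch below combines label predictions from several models by hard (majority) voting and scores the result with macro F1. The abstract does not say which voting variant the authors used, so hard voting is an assumption here, and the function names `majority_vote` and `macro_f1` are hypothetical helpers, not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Hard-voting ensemble (assumed variant, not confirmed by the abstract).

    `predictions` is a list of per-model label lists, all of the same length.
    """
    voted = []
    for labels in zip(*predictions):
        # On a tie, Counter.most_common keeps first-seen order,
        # so the earliest model in the list wins.
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over the classes present in `gold`."""
    scores = []
    for c in sorted(set(gold)):
        tp = sum(1 for g, p in zip(gold, pred) if g == c and p == c)
        fp = sum(1 for g, p in zip(gold, pred) if g != c and p == c)
        fn = sum(1 for g, p in zip(gold, pred) if g == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example: three models, four tweets, emoji classes encoded as integers.
model_a = [0, 1, 2, 2]
model_b = [0, 1, 1, 2]
model_c = [1, 1, 2, 0]
gold = [0, 1, 2, 1]

voted = majority_vote([model_a, model_b, model_c])
score = macro_f1(gold, voted)
```

Because macro F1 averages per-class scores without weighting by class frequency, rare emoji classes count as much as frequent ones, which is one plausible reason to train some models with class weights.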