Abstract
We present our submission to the SemEval 2018 task on emoji prediction. We used a random forest with an ensemble of bag-of-words, sentiment and psycholinguistic features. Although we performed well on the trial dataset (attaining a macro F-score of 63.185 for English and 81.381 for Spanish), our approach did not perform as well on the test data. We describe our features and classification protocol, as well as initial experiments, concluding with a discussion of the discrepancy between our trial and test results.
- Anthology ID:
- S18-1079
- Volume:
- Proceedings of the 12th International Workshop on Semantic Evaluation
- Month:
- June
- Year:
- 2018
- Address:
- New Orleans, Louisiana
- Venues:
- SemEval | *SEM
- SIGs:
- SIGLEX | SIGSEM
- Publisher:
- Association for Computational Linguistics
- Pages:
- 491–496
- URL:
- https://aclanthology.org/S18-1079
- DOI:
- 10.18653/v1/S18-1079
- Cite (ACL):
- Luciano Gerber and Matthew Shardlow. 2018. Manchester Metropolitan at SemEval-2018 Task 2: Random Forest with an Ensemble of Features for Predicting Emoji in Tweets. In Proceedings of the 12th International Workshop on Semantic Evaluation, pages 491–496, New Orleans, Louisiana. Association for Computational Linguistics.
- Cite (Informal):
- Manchester Metropolitan at SemEval-2018 Task 2: Random Forest with an Ensemble of Features for Predicting Emoji in Tweets (Gerber & Shardlow, SemEval-*SEM 2018)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/S18-1079.pdf