Abstract
We present a number of models used for hate speech detection for SemEval 2019 Task 5: HatEval. We evaluate the viability of multilingual learning for this task. We also experiment with adversarial learning as a means of creating a multilingual model. Ultimately, our multilingual models performed worse than their monolingual counterparts. We find that the choice of word representations (word embeddings) is crucial for deep learning: a simple switch from MUSE to ELMo embeddings yielded a 3-4% increase in accuracy. This also shows the importance of context when dealing with online content.
- Anthology ID:
- S19-2082
- Volume:
- Proceedings of the 13th International Workshop on Semantic Evaluation
- Month:
- June
- Year:
- 2019
- Address:
- Minneapolis, Minnesota, USA
- Editors:
- Jonathan May, Ekaterina Shutova, Aurelie Herbelot, Xiaodan Zhu, Marianna Apidianaki, Saif M. Mohammad
- Venue:
- SemEval
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 464–468
- URL:
- https://aclanthology.org/S19-2082
- DOI:
- 10.18653/v1/S19-2082
- Cite (ACL):
- Michal Bojkovský and Matúš Pikuliak. 2019. STUFIIT at SemEval-2019 Task 5: Multilingual Hate Speech Detection on Twitter with MUSE and ELMo Embeddings. In Proceedings of the 13th International Workshop on Semantic Evaluation, pages 464–468, Minneapolis, Minnesota, USA. Association for Computational Linguistics.
- Cite (Informal):
- STUFIIT at SemEval-2019 Task 5: Multilingual Hate Speech Detection on Twitter with MUSE and ELMo Embeddings (Bojkovský & Pikuliak, SemEval 2019)
- PDF:
- https://preview.aclanthology.org/emnlp22-frontmatter/S19-2082.pdf