SU-NLP at SemEval-2020 Task 12: Offensive Language IdentifiCation in Turkish Tweets

Anil Ozdemir, Reyyan Yeniterzi


Abstract
This paper summarizes our group’s efforts in the offensive language identification shared task, organized as part of the International Workshop on Semantic Evaluation (SemEval-2020). Our final submission is an ensemble of three models: (1) CNN-LSTM, (2) BiLSTM-Attention, and (3) BERT. Word embeddings pre-trained on tweets are used to train the first two models. BERTurk, the first BERT model for Turkish, is also explored. Our final submission ranked second in the Turkish sub-task.
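The abstract does not say how the three models' outputs are combined. As an illustration only, the sketch below assumes simple majority voting over per-tweet labels, which is one common way to build such an ensemble. The function name `majority_vote`, the label strings, and the example predictions are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common label per tweet across models.

    predictions: list of per-model label lists, all of the same length.
    Assumption: an odd number of models and two labels, so no ties occur.
    """
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*predictions)]


# Hypothetical per-model outputs for four tweets
# (OFF = offensive, NOT = not offensive); not real system output.
cnn_lstm = ["OFF", "NOT", "NOT", "OFF"]
bilstm_attention = ["OFF", "NOT", "OFF", "NOT"]
bert = ["NOT", "NOT", "OFF", "OFF"]

ensemble = majority_vote([cnn_lstm, bilstm_attention, bert])
print(ensemble)
```

With three voters and two labels every tweet has a strict majority, so no tie-breaking rule is needed in this sketch. Other combination schemes, such as averaging class probabilities, would need a different implementation.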
Anthology ID:
2020.semeval-1.288
Volume:
Proceedings of the Fourteenth Workshop on Semantic Evaluation
Month:
December
Year:
2020
Address:
Barcelona (online)
Venue:
SemEval
SIG:
SIGLEX
Publisher:
International Committee for Computational Linguistics
Pages:
2171–2176
URL:
https://aclanthology.org/2020.semeval-1.288
DOI:
10.18653/v1/2020.semeval-1.288
Cite (ACL):
Anil Ozdemir and Reyyan Yeniterzi. 2020. SU-NLP at SemEval-2020 Task 12: Offensive Language IdentifiCation in Turkish Tweets. In Proceedings of the Fourteenth Workshop on Semantic Evaluation, pages 2171–2176, Barcelona (online). International Committee for Computational Linguistics.
Cite (Informal):
SU-NLP at SemEval-2020 Task 12: Offensive Language IdentifiCation in Turkish Tweets (Ozdemir & Yeniterzi, SemEval 2020)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2020.semeval-1.288.pdf