Multi-word Expressions for Abusive Speech Detection in Serbian
Ranka Stanković, Jelena Mitrović, Danka Jokić, Cvetana Krstev
Abstract
This paper presents our work on the refinement and improvement of the Serbian language part of Hurtlex, a multilingual lexicon of words to hurt. We pay special attention to adding multi-word expressions (MWEs) that can be seen as abusive, as such lexical entries are very important for obtaining good results in a wide range of abusive language detection tasks. We use Serbian morphological dictionaries as a basis for data cleaning and MWE dictionary creation. A connection to other lexical and semantic resources in Serbian is outlined, and the building of abusive language detection systems based on that connection is foreseen.
- Anthology ID:
- 2020.mwe-1.10
- Volume:
- Proceedings of the Joint Workshop on Multiword Expressions and Electronic Lexicons
- Month:
- December
- Year:
- 2020
- Address:
- online
- Editors:
- Stella Markantonatou, John McCrae, Jelena Mitrović, Carole Tiberius, Carlos Ramisch, Ashwini Vaidya, Petya Osenova, Agata Savary
- Venue:
- MWE
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 74–84
- URL:
- https://aclanthology.org/2020.mwe-1.10
- Cite (ACL):
- Ranka Stanković, Jelena Mitrović, Danka Jokić, and Cvetana Krstev. 2020. Multi-word Expressions for Abusive Speech Detection in Serbian. In Proceedings of the Joint Workshop on Multiword Expressions and Electronic Lexicons, pages 74–84, online. Association for Computational Linguistics.
- Cite (Informal):
- Multi-word Expressions for Abusive Speech Detection in Serbian (Stanković et al., MWE 2020)
- PDF:
- https://preview.aclanthology.org/ingest-2024-clasp/2020.mwe-1.10.pdf