Abstract
Lexical substitution is the task of determining a meaning-preserving replacement for a word in context. We report on a preliminary study of this task for the Croatian language on a small-scale lexical sample dataset, manually annotated using three different annotation schemes. We compare the annotations, analyze the inter-annotator agreement, and observe a number of interesting language-specific details in the obtained lexical substitutes. Furthermore, we apply a recently proposed, dependency-based lexical substitution model to our dataset. The model achieves a P@3 score of 0.35, which indicates the difficulty of the task.

- Anthology ID: W17-1403
- Volume: Proceedings of the 6th Workshop on Balto-Slavic Natural Language Processing
- Month: April
- Year: 2017
- Address: Valencia, Spain
- Editors: Tomaž Erjavec, Jakub Piskorski, Lidia Pivovarova, Jan Šnajder, Josef Steinberger, Roman Yangarber
- Venue: BSNLP
- SIG: SIGSLAV
- Publisher: Association for Computational Linguistics
- Pages: 14–19
- URL: https://aclanthology.org/W17-1403
- DOI: 10.18653/v1/W17-1403
- Cite (ACL): Domagoj Alagić and Jan Šnajder. 2017. A Preliminary Study of Croatian Lexical Substitution. In Proceedings of the 6th Workshop on Balto-Slavic Natural Language Processing, pages 14–19, Valencia, Spain. Association for Computational Linguistics.
- Cite (Informal): A Preliminary Study of Croatian Lexical Substitution (Alagić & Šnajder, BSNLP 2017)
- PDF: https://preview.aclanthology.org/proper-vol2-ingestion/W17-1403.pdf