Overcoming the challenges in morphological annotation of Turkish in universal dependencies framework
Talha Bedir, Karahan Şahin, Onur Gungor, Suzan Uskudarli, Arzucan Özgür, Tunga Güngör, Balkiz Ozturk Basaran
Abstract
This paper presents several challenges faced when annotating Turkish treebanks in accordance with the Universal Dependencies (UD) guidelines and proposes solutions to address them. Most of these challenges stem from the lack of adequate support in the UD framework for accurately representing null morphemes and complex derivations, which results in a significant loss of information for Turkish. This loss negatively impacts the tools that are developed based on these treebanks. We raised and discussed these issues within the community on the official UD portal. This paper presents these issues and our proposals to represent morphosyntactic information for Turkish more accurately while adhering to the UD guidelines. This work aims to contribute to the representation of Turkish and other agglutinative languages in UD-based treebanks, which in turn helps in developing more accurately annotated datasets for such languages.
- Anthology ID:
- 2021.law-1.12
- Volume:
- Proceedings of the Joint 15th Linguistic Annotation Workshop (LAW) and 3rd Designing Meaning Representations (DMR) Workshop
- Month:
- November
- Year:
- 2021
- Address:
- Punta Cana, Dominican Republic
- Editors:
- Claire Bonial, Nianwen Xue
- Venue:
- LAW
- Publisher:
- Association for Computational Linguistics
- Pages:
- 112–122
- URL:
- https://aclanthology.org/2021.law-1.12
- DOI:
- 10.18653/v1/2021.law-1.12
- Cite (ACL):
- Talha Bedir, Karahan Şahin, Onur Gungor, Suzan Uskudarli, Arzucan Özgür, Tunga Güngör, and Balkiz Ozturk Basaran. 2021. Overcoming the challenges in morphological annotation of Turkish in universal dependencies framework. In Proceedings of the Joint 15th Linguistic Annotation Workshop (LAW) and 3rd Designing Meaning Representations (DMR) Workshop, pages 112–122, Punta Cana, Dominican Republic. Association for Computational Linguistics.
- Cite (Informal):
- Overcoming the challenges in morphological annotation of Turkish in universal dependencies framework (Bedir et al., LAW 2021)
- PDF:
- https://preview.aclanthology.org/dois-2013-emnlp/2021.law-1.12.pdf