Abstract
In this paper, we present the construction of a CCGbank for Turkish from standardised dependency corpora. We automatically induce Combinatory Categorial Grammar (CCG) categories for each word token in the Turkish dependency corpora. The CCG induction algorithm we present here is based on the dependency relations defined in the latest release of the Universal Dependencies (UD) framework. We aim for an algorithm that can easily be applied to all Turkish treebanks annotated in this framework. Therefore, we employ a lexicalist approach in order to make full use of the dependency relations while creating a semantically transparent corpus. We present the treebanks we employed in this study as well as their annotation framework. We introduce the structure of the algorithm we used, along with the specific points on which it differs from previous studies. Lastly, we show how the results obtained with this lexicalist approach differ from those of previous CCGbank studies for Turkish.
- Anthology ID:
- 2023.gwc-1.25
- Volume:
- Proceedings of the 12th Global Wordnet Conference
- Month:
- January
- Year:
- 2023
- Address:
- University of the Basque Country, Donostia - San Sebastian, Basque Country
- Editors:
- German Rigau, Francis Bond, Alexandre Rademaker
- Venue:
- GWC
- Publisher:
- Global Wordnet Association
- Pages:
- 205–213
- URL:
- https://aclanthology.org/2023.gwc-1.25
- Cite (ACL):
- Aslı Kuzgun, Oğuz Kerem Yıldız, and Olcay Taner Yildiz. 2023. A CCGbank for Turkish: From Dependency to CCG. In Proceedings of the 12th Global Wordnet Conference, pages 205–213, University of the Basque Country, Donostia - San Sebastian, Basque Country. Global Wordnet Association.
- Cite (Informal):
- A CCGbank for Turkish: From Dependency to CCG (Kuzgun et al., GWC 2023)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-1/2023.gwc-1.25.pdf