A CCGbank for Turkish: From Dependency to CCG

Aslı Kuzgun, Oğuz Kerem Yıldız, Olcay Taner Yildiz


Abstract
In this paper, we present the construction of a CCGbank for Turkish from standardised dependency corpora. We automatically induce Combinatory Categorial Grammar (CCG) categories for each word token in these corpora. The induction algorithm is based on the dependency relations defined in the latest release of the Universal Dependencies (UD) framework. We aim for an algorithm that can easily be applied to all Turkish treebanks annotated in this framework. We therefore employ a lexicalist approach in order to make full use of the dependency relations while creating a semantically transparent corpus. We describe the treebanks employed in this study and their annotation framework, and we introduce the structure of the algorithm along with the specific issues in which it differs from previous studies. Lastly, we show how the results of this lexicalist approach differ from those of previous CCGbank studies for Turkish.
Anthology ID:
2023.gwc-1.25
Volume:
Proceedings of the 12th Global Wordnet Conference
Month:
January
Year:
2023
Address:
University of the Basque Country, Donostia - San Sebastian, Basque Country
Editors:
German Rigau, Francis Bond, Alexandre Rademaker
Venue:
GWC
Publisher:
Global Wordnet Association
Pages:
205–213
URL:
https://aclanthology.org/2023.gwc-1.25
Cite (ACL):
Aslı Kuzgun, Oğuz Kerem Yıldız, and Olcay Taner Yildiz. 2023. A CCGbank for Turkish: From Dependency to CCG. In Proceedings of the 12th Global Wordnet Conference, pages 205–213, University of the Basque Country, Donostia - San Sebastian, Basque Country. Global Wordnet Association.
Cite (Informal):
A CCGbank for Turkish: From Dependency to CCG (Kuzgun et al., GWC 2023)
PDF:
https://preview.aclanthology.org/nschneid-patch-2/2023.gwc-1.25.pdf