Umut Özge


2024

A coreference corpus of Turkish situated dialogs
Faruk Büyüktekin | Umut Özge
Proceedings of the First Workshop on Natural Language Processing for Turkic Languages (SIGTURK 2024)

The paper introduces a publicly available corpus of Turkish situated dialogs annotated for coreference. We developed a coreference annotation scheme for Turkish, a pro-drop language with rich agglutinating morphology. The scheme is tailored to these properties of the language, making it potentially applicable to similar languages. The corpus comprises 60 dialogs containing a total of 3900 sentences, 18360 words, and 6120 mentions.

2004

Development of a Corpus Workbench for the METU Turkish Corpus
Umut Özge | Bilge Say
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)