Building a Manually Annotated Hungarian Coreference Corpus: Workflow and Tools

Noémi Vadász


Abstract
This paper presents the complete workflow for building KorKor, a manually annotated Hungarian corpus, with particular attention to anaphora and coreference annotation. All linguistic annotation layers were corrected manually. The corpus is freely available in two formats. The paper describes how the workflow was set up and the challenges that arose along the way.
Anthology ID: 2022.crac-1.5
Volume: Proceedings of the Fifth Workshop on Computational Models of Reference, Anaphora and Coreference
Month: October
Year: 2022
Address: Gyeongju, Republic of Korea
Venue: CRAC
Publisher: Association for Computational Linguistics
Pages: 38–47
URL: https://aclanthology.org/2022.crac-1.5
Cite (ACL): Noémi Vadász. 2022. Building a Manually Annotated Hungarian Coreference Corpus: Workflow and Tools. In Proceedings of the Fifth Workshop on Computational Models of Reference, Anaphora and Coreference, pages 38–47, Gyeongju, Republic of Korea. Association for Computational Linguistics.
Cite (Informal): Building a Manually Annotated Hungarian Coreference Corpus: Workflow and Tools (Vadász, CRAC 2022)
PDF: https://preview.aclanthology.org/auto-file-uploads/2022.crac-1.5.pdf