NARC – Norwegian Anaphora Resolution Corpus

Petter Mæhlum, Dag Haug, Tollef Jørgensen, Andre Kåsen, Anders Nøklestad, Egil Rønningstad, Per Erik Solberg, Erik Velldal, Lilja Øvrelid


Abstract
We present the Norwegian Anaphora Resolution Corpus (NARC), the first publicly available corpus annotated with anaphoric relations between noun phrases for Norwegian. The paper describes the annotated data for 326 documents in Norwegian Bokmål, together with inter-annotator agreement and discussions of relevant statistics. We also present preliminary modelling results that are comparable to those reported on existing corpora for other languages, and discuss relevant problems in relation to both modelling and the annotations themselves.
Anthology ID:
2022.crac-1.6
Volume:
Proceedings of the Fifth Workshop on Computational Models of Reference, Anaphora and Coreference
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Editors:
Maciej Ogrodniczuk, Sameer Pradhan, Anna Nedoluzhko, Vincent Ng, Massimo Poesio
Venue:
CRAC
Publisher:
Association for Computational Linguistics
Pages:
48–60
URL:
https://preview.aclanthology.org/sigedu-bea-out-of-sync-correction/2022.crac-1.6/
Cite (ACL):
Petter Mæhlum, Dag Haug, Tollef Jørgensen, Andre Kåsen, Anders Nøklestad, Egil Rønningstad, Per Erik Solberg, Erik Velldal, and Lilja Øvrelid. 2022. NARC – Norwegian Anaphora Resolution Corpus. In Proceedings of the Fifth Workshop on Computational Models of Reference, Anaphora and Coreference, pages 48–60, Gyeongju, Republic of Korea. Association for Computational Linguistics.
Cite (Informal):
NARC – Norwegian Anaphora Resolution Corpus (Mæhlum et al., CRAC 2022)
PDF:
https://preview.aclanthology.org/sigedu-bea-out-of-sync-correction/2022.crac-1.6.pdf