NARC – Norwegian Anaphora Resolution Corpus
Petter Mæhlum, Dag Haug, Tollef Jørgensen, Andre Kåsen, Anders Nøklestad, Egil Rønningstad, Per Erik Solberg, Erik Velldal, Lilja Øvrelid
Abstract
We present the Norwegian Anaphora Resolution Corpus (NARC), the first publicly available corpus annotated with anaphoric relations between noun phrases for Norwegian. The paper describes the annotated data for 326 documents in Norwegian Bokmål, together with inter-annotator agreement and discussions of relevant statistics. We also present preliminary modelling results which are comparable to existing corpora for other languages, and discuss relevant problems in relation to both modelling and the annotations themselves.
- Anthology ID:
- 2022.crac-1.6
- Volume:
- Proceedings of the Fifth Workshop on Computational Models of Reference, Anaphora and Coreference
- Month:
- October
- Year:
- 2022
- Address:
- Gyeongju, Republic of Korea
- Editors:
- Maciej Ogrodniczuk, Sameer Pradhan, Anna Nedoluzhko, Vincent Ng, Massimo Poesio
- Venue:
- CRAC
- Publisher:
- Association for Computational Linguistics
- Pages:
- 48–60
- URL:
- https://aclanthology.org/2022.crac-1.6
- Cite (ACL):
- Petter Mæhlum, Dag Haug, Tollef Jørgensen, Andre Kåsen, Anders Nøklestad, Egil Rønningstad, Per Erik Solberg, Erik Velldal, and Lilja Øvrelid. 2022. NARC – Norwegian Anaphora Resolution Corpus. In Proceedings of the Fifth Workshop on Computational Models of Reference, Anaphora and Coreference, pages 48–60, Gyeongju, Republic of Korea. Association for Computational Linguistics.
- Cite (Informal):
- NARC – Norwegian Anaphora Resolution Corpus (Mæhlum et al., CRAC 2022)
- PDF:
- https://preview.aclanthology.org/landing_page/2022.crac-1.6.pdf
- Code:
- ltgoslo/narc
- Data:
- BASHI, NorNE