Investigating Cross-Document Event Coreference for Dutch

Loic De Langhe, Orphee De Clercq, Veronique Hoste


Abstract
In this paper we present baseline results for Event Coreference Resolution (ECR) in Dutch using gold-standard (i.e., non-predicted) event mentions. A newly developed benchmark dataset allows us to properly investigate the possibility of creating ECR systems for both within-document and cross-document coreference. We give an overview of the state of the art for ECR in other languages, as well as a detailed overview of existing ECR resources. Afterwards, we provide a comparative report on our own dataset. We apply a range of approaches that have been shown to attain good results for English ECR, including feature-based models, monolingual transformer language models and multilingual language models. The best results were obtained using the monolingual BERTje model. Finally, results for all models are thoroughly analysed and visualised, so as to provide insight into the inner workings of ECR and long-distance semantic NLP tasks in general.
Anthology ID:
2022.crac-1.9
Volume:
Proceedings of the Fifth Workshop on Computational Models of Reference, Anaphora and Coreference
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Venue:
CRAC
Publisher:
Association for Computational Linguistics
Pages:
88–98
URL:
https://aclanthology.org/2022.crac-1.9
Cite (ACL):
Loic De Langhe, Orphee De Clercq, and Veronique Hoste. 2022. Investigating Cross-Document Event Coreference for Dutch. In Proceedings of the Fifth Workshop on Computational Models of Reference, Anaphora and Coreference, pages 88–98, Gyeongju, Republic of Korea. Association for Computational Linguistics.
Cite (Informal):
Investigating Cross-Document Event Coreference for Dutch (De Langhe et al., CRAC 2022)
PDF:
https://preview.aclanthology.org/auto-file-uploads/2022.crac-1.9.pdf
Data
ECB+