A Hybrid Rule-Based and Neural Coreference Resolution System with an Evaluation on Dutch Literature

Andreas van Cranenburgh, Esther Ploeger, Frank van den Berg, Remi Thüss


Abstract
We introduce a modular, hybrid coreference resolution system that extends a rule-based baseline with three neural classifiers for three subtasks: mention detection, mention attributes (gender, animacy, number), and pronoun resolution. In our experiments with Dutch literature, the classifiers substantially increase coreference performance on the development set across all metrics: mention detection, LEA, CoNLL, and especially pronoun accuracy. However, on the test set, the best results are obtained with rule-based pronoun resolution. This inconsistent result highlights that the rule-based system is still a strong baseline, and more work is needed to improve pronoun resolution robustly for this dataset. While end-to-end neural systems require no feature engineering and achieve excellent performance in standard benchmarks with large training sets, our simple hybrid system scales well to long-document coreference (>10k words) and attains superior results in our experiments on literature.
Anthology ID:
2021.crac-1.5
Volume:
Proceedings of the Fourth Workshop on Computational Models of Reference, Anaphora and Coreference
Month:
November
Year:
2021
Address:
Punta Cana, Dominican Republic
Editors:
Maciej Ogrodniczuk, Sameer Pradhan, Massimo Poesio, Yulia Grishina, Vincent Ng
Venue:
CRAC
Publisher:
Association for Computational Linguistics
Pages:
47–56
URL:
https://aclanthology.org/2021.crac-1.5
DOI:
10.18653/v1/2021.crac-1.5
Cite (ACL):
Andreas van Cranenburgh, Esther Ploeger, Frank van den Berg, and Remi Thüss. 2021. A Hybrid Rule-Based and Neural Coreference Resolution System with an Evaluation on Dutch Literature. In Proceedings of the Fourth Workshop on Computational Models of Reference, Anaphora and Coreference, pages 47–56, Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
A Hybrid Rule-Based and Neural Coreference Resolution System with an Evaluation on Dutch Literature (van Cranenburgh et al., CRAC 2021)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2021.crac-1.5.pdf
Video:
https://preview.aclanthology.org/emnlp-22-attachments/2021.crac-1.5.mp4
Code:
andreasvc/dutchcoref