Automatic Error Analysis for Document-level Information Extraction
Aliva Das, Xinya Du, Barry Wang, Kejian Shi, Jiayuan Gu, Thomas Porter, Claire Cardie
Abstract
Document-level information extraction (IE) tasks have recently begun to be revisited in earnest using the end-to-end neural network techniques that have been successful on their sentence-level IE counterparts. Evaluation of the approaches, however, has been limited in a number of dimensions. In particular, the precision/recall/F1 scores typically reported provide few insights on the range of errors the models make. We build on the work of Kummerfeld and Klein (2013) to propose a transformation-based framework for automating error analysis in document-level event and (N-ary) relation extraction. We employ our framework to compare two state-of-the-art document-level template-filling approaches on datasets from three domains; and then, to gauge progress in IE since its inception 30 years ago, vs. four systems from the MUC-4 (1992) evaluation.
- Anthology ID:
- 2022.acl-long.274
- Volume:
- Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- May
- Year:
- 2022
- Address:
- Dublin, Ireland
- Editors:
- Smaranda Muresan, Preslav Nakov, Aline Villavicencio
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 3960–3975
- URL:
- https://aclanthology.org/2022.acl-long.274
- DOI:
- 10.18653/v1/2022.acl-long.274
- Cite (ACL):
- Aliva Das, Xinya Du, Barry Wang, Kejian Shi, Jiayuan Gu, Thomas Porter, and Claire Cardie. 2022. Automatic Error Analysis for Document-level Information Extraction. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 3960–3975, Dublin, Ireland. Association for Computational Linguistics.
- Cite (Informal):
- Automatic Error Analysis for Document-level Information Extraction (Das et al., ACL 2022)
- PDF:
- https://preview.aclanthology.org/naacl24-info/2022.acl-long.274.pdf
- Code:
- icejinx33/auto-err-template-fill
- Data:
- MUC-4, SciREX