The Devil is in the Details: On the Pitfalls of Event Extraction Evaluation
Hao Peng, Xiaozhi Wang, Feng Yao, Kaisheng Zeng, Lei Hou, Juanzi Li, Zhiyuan Liu, Weixing Shen
Abstract
Event extraction (EE) is a crucial task that aims to extract events from texts and comprises two subtasks: event detection (ED) and event argument extraction (EAE). In this paper, we check the reliability of EE evaluations and identify three major pitfalls: (1) The data preprocessing discrepancy makes evaluation results on the same dataset not directly comparable, yet data preprocessing details are not widely noted or specified in papers. (2) The output space discrepancy between model paradigms leaves EE models of different paradigms without grounds for comparison and also leads to unclear mapping between predictions and annotations. (3) The absence of pipeline evaluation in many EAE-only works makes them hard to compare directly with EE works and may not reflect model performance well in real-world pipeline scenarios. We demonstrate the significant influence of these pitfalls through comprehensive meta-analyses of recent papers and empirical experiments. To avoid these pitfalls, we suggest a series of remedies, including specifying data preprocessing, standardizing outputs, and providing pipeline evaluation results. To help implement these remedies, we develop a consistent evaluation framework, OmniEvent, which can be obtained from https://github.com/THU-KEG/OmniEvent.
- Anthology ID:
- 2023.findings-acl.586
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2023
- Month:
- July
- Year:
- 2023
- Address:
- Toronto, Canada
- Editors:
- Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 9206–9227
- URL:
- https://aclanthology.org/2023.findings-acl.586
- DOI:
- 10.18653/v1/2023.findings-acl.586
- Cite (ACL):
- Hao Peng, Xiaozhi Wang, Feng Yao, Kaisheng Zeng, Lei Hou, Juanzi Li, Zhiyuan Liu, and Weixing Shen. 2023. The Devil is in the Details: On the Pitfalls of Event Extraction Evaluation. In Findings of the Association for Computational Linguistics: ACL 2023, pages 9206–9227, Toronto, Canada. Association for Computational Linguistics.
- Cite (Informal):
- The Devil is in the Details: On the Pitfalls of Event Extraction Evaluation (Peng et al., Findings 2023)
- PDF:
- https://preview.aclanthology.org/emnlp-22-attachments/2023.findings-acl.586.pdf