DocEE: A Large-Scale and Fine-grained Benchmark for Document-level Event Extraction
MeiHan Tong, Bin Xu, Shuai Wang, Meihuan Han, Yixin Cao, Jiangqi Zhu, Siyu Chen, Lei Hou, Juanzi Li
Abstract
Event extraction aims to identify an event and then extract the arguments participating in the event. Despite the great success in sentence-level event extraction, events are more naturally presented in the form of documents, with event arguments scattered across multiple sentences. However, a major barrier to promoting document-level event extraction has been the lack of large-scale and practical training and evaluation datasets. In this paper, we present DocEE, a new document-level event extraction dataset including 27,000+ events and 180,000+ arguments. We highlight three features: large-scale manual annotations, fine-grained argument types, and application-oriented settings. Experiments show that there is still a big gap between state-of-the-art models and human beings (41% vs. 85% in F1 score), indicating that DocEE is an open issue. DocEE is now available at https://github.com/tongmeihan1995/DocEE.git.
- Anthology ID:
- 2022.naacl-main.291
- Volume:
- Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
- Month:
- July
- Year:
- 2022
- Address:
- Seattle, United States
- Editors:
- Marine Carpuat, Marie-Catherine de Marneffe, Ivan Vladimir Meza Ruiz
- Venue:
- NAACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 3970–3982
- URL:
- https://aclanthology.org/2022.naacl-main.291
- DOI:
- 10.18653/v1/2022.naacl-main.291
- Cite (ACL):
- MeiHan Tong, Bin Xu, Shuai Wang, Meihuan Han, Yixin Cao, Jiangqi Zhu, Siyu Chen, Lei Hou, and Juanzi Li. 2022. DocEE: A Large-Scale and Fine-grained Benchmark for Document-level Event Extraction. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 3970–3982, Seattle, United States. Association for Computational Linguistics.
- Cite (Informal):
- DocEE: A Large-Scale and Fine-grained Benchmark for Document-level Event Extraction (Tong et al., NAACL 2022)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/2022.naacl-main.291.pdf
- Code:
- tongmeihan1995/docee
- Data:
- WikiEvents