Iterative Document-level Information Extraction via Imitation Learning
Yunmo Chen, William Gantt, Weiwei Gu, Tongfei Chen, Aaron White, Benjamin Van Durme
Abstract
We present IterX, a novel iterative extraction model for extracting complex relations, or templates, i.e., N-tuples representing a mapping from named slots to spans of text within a document. Documents may feature zero or more instances of a template of any given type, and the task of template extraction entails identifying the templates in a document and extracting each template’s slot values. Our imitation learning approach casts the problem as a Markov decision process (MDP) and removes the need to use predefined template orders to train an extractor. It leads to state-of-the-art results on two established benchmarks – 4-ary relation extraction on SciREX and template extraction on MUC-4 – as well as a strong baseline on the new BETTER Granular task.
- Anthology ID: 2023.eacl-main.136
- Volume: Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics
- Month: May
- Year: 2023
- Address: Dubrovnik, Croatia
- Editors: Andreas Vlachos, Isabelle Augenstein
- Venue: EACL
- Publisher: Association for Computational Linguistics
- Pages: 1858–1874
- URL: https://aclanthology.org/2023.eacl-main.136
- DOI: 10.18653/v1/2023.eacl-main.136
- Award: EACL Outstanding Paper
- Cite (ACL): Yunmo Chen, William Gantt, Weiwei Gu, Tongfei Chen, Aaron White, and Benjamin Van Durme. 2023. Iterative Document-level Information Extraction via Imitation Learning. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, pages 1858–1874, Dubrovnik, Croatia. Association for Computational Linguistics.
- Cite (Informal): Iterative Document-level Information Extraction via Imitation Learning (Chen et al., EACL 2023)
- PDF: https://preview.aclanthology.org/nschneid-patch-1/2023.eacl-main.136.pdf