Document-level Entity-based Extraction as Template Generation

Kung-Hsiang Huang, Sam Tang, Nanyun Peng


Abstract
Document-level entity-based extraction (EE), aiming at extracting entity-centric information such as entity roles and entity relations, is key to automatic knowledge acquisition from text corpora for various domains. Most document-level EE systems build extractive models, which struggle to model long-term dependencies among entities at the document level. To address this issue, we propose a generative framework for two document-level EE tasks: role-filler entity extraction (REE) and relation extraction (RE). We first formulate them as a template generation problem, allowing models to efficiently capture cross-entity dependencies, exploit label semantics, and avoid the exponential computational complexity of identifying N-ary relations. A novel cross-attention guided copy mechanism, TopK Copy, is incorporated into a pre-trained sequence-to-sequence model to enhance its ability to identify key information in the input document. Experiments on the MUC-4 and SciREX datasets show new state-of-the-art results on REE (+3.26%), binary RE (+4.8%), and 4-ary RE (+2.7%) in F1 score.
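To make the "extraction as template generation" idea concrete, here is a minimal sketch of how a set of role-filler entities could be flattened into a single target string for a sequence-to-sequence model and parsed back afterwards. The separators, the `none` placeholder, the role names, and the functions `linearize` and `parse` are all assumptions made for illustration; they are not the template format used in the paper or in the PlusLabNLP/TempGen repository.

```python
from typing import Dict, List

ROLE_SEP = " ; "    # hypothetical separator between roles
FILLER_SEP = " | "  # hypothetical separator between fillers of one role
EMPTY = "none"      # hypothetical placeholder for a role with no filler


def linearize(template: Dict[str, List[str]]) -> str:
    """Flatten a role -> fillers mapping into one target string."""
    parts = []
    for role, fillers in template.items():
        body = FILLER_SEP.join(fillers) if fillers else EMPTY
        parts.append(f"{role}: {body}")
    return ROLE_SEP.join(parts)


def parse(text: str) -> Dict[str, List[str]]:
    """Invert linearize(): recover the role -> fillers mapping from a string."""
    template: Dict[str, List[str]] = {}
    for part in text.split(ROLE_SEP):
        role, _, body = part.partition(": ")
        template[role] = [] if body == EMPTY else body.split(FILLER_SEP)
    return template


# Illustrative role names and fillers, invented for this example.
example = {
    "perpetrator": ["guerrillas"],
    "target": ["power station", "bridge"],
    "weapon": [],
}
target_string = linearize(example)
recovered = parse(target_string)
```

Because every role is written into one output sequence, a generative model conditions each filler on the fillers it has already produced, which is one way to read the abstract's claim about capturing cross-entity dependencies. The sketch assumes that fillers never contain the separator strings and are never literally `none`; a real system would need escaping or special tokens.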
Anthology ID:
2021.emnlp-main.426
Volume:
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2021
Address:
Online and Punta Cana, Dominican Republic
Editors:
Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
5257–5269
URL:
https://aclanthology.org/2021.emnlp-main.426
DOI:
10.18653/v1/2021.emnlp-main.426
Cite (ACL):
Kung-Hsiang Huang, Sam Tang, and Nanyun Peng. 2021. Document-level Entity-based Extraction as Template Generation. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 5257–5269, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Document-level Entity-based Extraction as Template Generation (Huang et al., EMNLP 2021)
PDF:
https://preview.aclanthology.org/ingest-2024-clasp/2021.emnlp-main.426.pdf
Video:
https://preview.aclanthology.org/ingest-2024-clasp/2021.emnlp-main.426.mp4
Code:
PlusLabNLP/TempGen
Data:
MUC-4, SciREX