Text-based NP Enrichment

Yanai Elazar, Victoria Basmov, Yoav Goldberg, Reut Tsarfaty


Abstract
Understanding the relations between entities denoted by NPs in a text is a critical part of human-like natural language understanding. However, standard NLP tasks and benchmarks currently cover only a fraction of such relations. In this work, we propose a novel task termed text-based NP enrichment (TNE), in which we aim to enrich each NP in a text with all the preposition-mediated relations, whether explicit or implicit, that hold between it and other NPs in the text. The relations are represented as triplets, each consisting of two NPs related via a preposition. Humans recover such relations seamlessly, while current state-of-the-art models struggle with them due to the implicit nature of the problem. We build the first large-scale dataset for the problem, provide the formal framing and scope of annotation, analyze the data, and report the results of fine-tuned language models on the task, demonstrating the challenge it poses to current technology. To foster further research into this challenging problem, a webpage with a data-exploration UI, a demo, and links to the code, models, and leaderboard is available at yanaiela.github.io/TNE/.
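
The triplet representation described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the class name `NPRelation`, the field names `anchor`, `preposition`, and `complement`, the helper `enrich`, and the example NPs are hypothetical and are not taken from the released dataset or code.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class NPRelation:
    """One preposition-mediated relation between two NPs (hypothetical schema)."""
    anchor: str       # the NP being enriched
    preposition: str  # the preposition linking the two NPs
    complement: str   # the NP the anchor is related to

    def as_phrase(self) -> str:
        # Render the triplet as a readable phrase.
        return f"{self.anchor} {self.preposition} {self.complement}"


def enrich(relations):
    """Group triplets by anchor NP, so each NP lists its (preposition, NP) links."""
    enriched = defaultdict(list)
    for rel in relations:
        enriched[rel.anchor].append((rel.preposition, rel.complement))
    return dict(enriched)


# Invented example for the sentence:
# "Adam's father went to meet the teacher at his school."
relations = [
    NPRelation("the teacher", "of", "Adam"),        # implicit relation
    NPRelation("the teacher", "at", "his school"),  # explicit relation
    NPRelation("his school", "of", "Adam"),         # implicit relation
]
enriched = enrich(relations)
```

Grouping by anchor mirrors the task framing, in which each NP is enriched with every relation it holds to other NPs in the text; a flat list of triplets would carry the same information.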
Anthology ID:
2022.tacl-1.44
Volume:
Transactions of the Association for Computational Linguistics, Volume 10
Year:
2022
Address:
Cambridge, MA
Venue:
TACL
Publisher:
MIT Press
Pages:
764–784
URL:
https://aclanthology.org/2022.tacl-1.44
DOI:
10.1162/tacl_a_00488
Cite (ACL):
Yanai Elazar, Victoria Basmov, Yoav Goldberg, and Reut Tsarfaty. 2022. Text-based NP Enrichment. Transactions of the Association for Computational Linguistics, 10:764–784.
Cite (Informal):
Text-based NP Enrichment (Elazar et al., TACL 2022)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2022.tacl-1.44.pdf
Video:
https://preview.aclanthology.org/ingestion-script-update/2022.tacl-1.44.mp4