Improving Relation Extraction through Syntax-induced Pre-training with Dependency Masking

Yuanhe Tian, Yan Song, Fei Xia


Abstract
Relation extraction (RE) is an important natural language processing task that predicts the relation between two given entities, where a good understanding of contextual information is essential to achieve outstanding model performance. Among different types of contextual information, auto-generated syntactic information (namely, word dependencies) has shown its effectiveness for the task. However, most existing studies require modifications to existing baseline architectures (e.g., adding new components, such as a GCN, on top of an encoder) to leverage the syntactic information. To offer an alternative solution, we propose to leverage syntactic information to improve RE by training a syntax-induced encoder on auto-parsed data through dependency masking. Specifically, the syntax-induced encoder is trained by recovering masked dependency connections and types of the first, second, and third order, which significantly differs from existing studies that train language models or word embeddings by predicting context words along dependency paths. Experimental results on two English benchmark datasets, namely ACE2005EN and SemEval 2010 Task 8, demonstrate the effectiveness of our approach for RE: it outperforms strong baselines and achieves state-of-the-art results on both datasets.
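To make the pre-training objective concrete, here is a minimal, hypothetical PyTorch sketch of dependency masking. It assumes an encoder has already produced contextual token representations; all names (DependencyMaskingHead, kth_order_heads, arc_scorer) and design choices (a bilinear arc scorer, 15% masking) are illustrative assumptions, not the authors' implementation, which is released in the synlp/RE-DMP repository.

import torch
import torch.nn as nn


def kth_order_heads(heads: torch.Tensor, k: int) -> torch.Tensor:
    """Follow first-order head pointers k times to get k-th-order heads.
    heads: (batch, seq_len), the auto-parsed head index of each token
    (the root points to itself in this sketch)."""
    out = torch.arange(heads.size(1), device=heads.device).expand_as(heads)
    for _ in range(k):
        out = heads.gather(1, out)
    return out


class DependencyMaskingHead(nn.Module):
    """Recovers masked dependency arcs: scores every (dependent, head)
    pair and classifies the dependency type of each pair."""

    def __init__(self, hidden_size: int, num_dep_types: int):
        super().__init__()
        self.arc_scorer = nn.Bilinear(hidden_size, hidden_size, 1)
        self.type_classifier = nn.Linear(2 * hidden_size, num_dep_types)

    def forward(self, h: torch.Tensor):
        # h: (batch, seq_len, hidden) encoder output
        b, n, d = h.shape
        dep = h.unsqueeze(2).expand(b, n, n, d).contiguous()   # dependents
        head = h.unsqueeze(1).expand(b, n, n, d).contiguous()  # candidate heads
        arc_scores = self.arc_scorer(dep, head).squeeze(-1)    # (b, n, n)
        type_logits = self.type_classifier(torch.cat([dep, head], dim=-1))
        return arc_scores, type_logits


# Toy pre-training step on one auto-parsed batch.
b, n, d, num_types = 2, 6, 16, 40
h = torch.randn(b, n, d)                          # stand-in for encoder output
gold_heads = torch.randint(0, n, (b, n))          # auto-parsed head indices
gold_types = torch.randint(0, num_types, (b, n))  # auto-parsed dependency types
masked = torch.rand(b, n) < 0.15                  # tokens whose arcs are masked
masked[:, 0] = True                               # ensure at least one per sentence

model = DependencyMaskingHead(d, num_types)
arc_scores, type_logits = model(h)

# connection recovery: cross-entropy over candidate heads at masked positions
arc_loss = nn.functional.cross_entropy(arc_scores[masked], gold_heads[masked])
# type recovery: classify the type of the gold (dependent, head) pair
pair_logits = type_logits[masked].gather(
    1, gold_heads[masked].view(-1, 1, 1).expand(-1, 1, num_types)
).squeeze(1)
type_loss = nn.functional.cross_entropy(pair_logits, gold_types[masked])

# higher-order targets: follow head pointers twice (or three times) and add
# analogous arc/type losses for second- and third-order connections
gold_heads_2nd = kth_order_heads(gold_heads, 2)

(arc_loss + type_loss).backward()

One natural reading of the first/second/third-order objective, reflected in kth_order_heads above, is that the k-th-order connection of a token is reached by following the dependency head pointer k times; whether the paper defines the orders exactly this way is an assumption of this sketch.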
Anthology ID:
2022.findings-acl.147
Original:
2022.findings-acl.147v1
Version 2:
2022.findings-acl.147v2
Volume:
Findings of the Association for Computational Linguistics: ACL 2022
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
1875–1886
URL:
https://aclanthology.org/2022.findings-acl.147
DOI:
10.18653/v1/2022.findings-acl.147
Cite (ACL):
Yuanhe Tian, Yan Song, and Fei Xia. 2022. Improving Relation Extraction through Syntax-induced Pre-training with Dependency Masking. In Findings of the Association for Computational Linguistics: ACL 2022, pages 1875–1886, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Improving Relation Extraction through Syntax-induced Pre-training with Dependency Masking (Tian et al., Findings 2022)
PDF:
https://preview.aclanthology.org/naacl-24-ws-corrections/2022.findings-acl.147.pdf
Software:
2022.findings-acl.147.software.zip
Code:
synlp/RE-DMP
Data:
Penn Treebank, SemEval-2010 Task 8