BertNet: Harvesting Knowledge Graphs with Arbitrary Relations from Pretrained Language Models

Shibo Hao; Bowen Tan; Kaiwen Tang; Bin Ni; Xiyan Shao; Hengzhe Zhang; Eric Xing; Zhiting Hu

Abstract

Automatically constructing knowledge graphs (KGs) of diverse new relations is crucial for supporting knowledge discovery and broad applications. Previous KG construction methods, based on either crowdsourcing or text mining, are often limited to a small predefined set of relations due to manual cost or restrictions of the text corpus. Recent research has proposed using pretrained language models (LMs) as implicit knowledge bases that accept knowledge queries in the form of prompts. Yet the implicit knowledge lacks many desirable properties of a full-scale symbolic KG, such as easy access, navigation, editing, and quality assurance. In this paper, we propose a new approach for harvesting massive KGs of arbitrary relations from pretrained LMs. Given a minimal relation definition as input (a prompt and a few example entity pairs), the approach efficiently searches the vast entity pair space to extract diverse, accurate knowledge of the desired relation. We develop an effective search-and-rescore mechanism for improved efficiency and accuracy. We deploy the approach to harvest KGs of over 400 new relations, from LMs of varying capacities such as RoBERTaNet. Extensive human and automatic evaluations show that our approach extracts diverse, accurate knowledge, including tuples of complex relations (e.g., “A is capable of but not good at B”). The resulting KGs, as a symbolic interpretation of the source LMs, also reveal new insights into the LMs’ knowledge capacities.

Anthology ID: 2023.findings-acl.309
Volume: Findings of the Association for Computational Linguistics: ACL 2023
Month: July
Year: 2023
Address: Toronto, Canada
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 5000–5015
URL: https://aclanthology.org/2023.findings-acl.309
Cite (ACL): Shibo Hao, Bowen Tan, Kaiwen Tang, Bin Ni, Xiyan Shao, Hengzhe Zhang, Eric Xing, and Zhiting Hu. 2023. BertNet: Harvesting Knowledge Graphs with Arbitrary Relations from Pretrained Language Models. In Findings of the Association for Computational Linguistics: ACL 2023, pages 5000–5015, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): BertNet: Harvesting Knowledge Graphs with Arbitrary Relations from Pretrained Language Models (Hao et al., Findings 2023)
PDF: https://preview.aclanthology.org/paclic-22-ingestion/2023.findings-acl.309.pdf
