NS-Hunter: BERT-Cloze Based Semantic Denoising for Distantly Supervised Relation Classification

Shen Tielin, Wang Daling, Feng Shi, Zhang Yifei


Abstract
Distant supervision can generate large-scale relation classification data quickly and economically. However, it introduces a great number of noise sentences that do not express their labeled relations. Leveraging the power of the pre-trained language model BERT, in this paper we propose a BERT-based semantic denoising approach for distantly supervised relation classification. In detail, we define an entity pair as a source entity and a target entity. For specific sentences, whose target entity is in the BERT vocabulary (a one-token word), we present the differences in the dependency between the two entities for noise and non-noise sentences. For general sentences, whose target entity is a multi-token word, we further present the differences in the last hidden states of the [MASK]-entity (MASK-lhs for short) in BERT for noise and non-noise sentences. We regard the dependency and MASK-lhs in BERT as two semantic features of sentences. With BERT, we first capture the dependency feature to discriminate specific sentences, then capture the MASK-lhs feature to denoise distant supervision datasets. We propose NS-Hunter, a novel denoising model that leverages the BERT-cloze ability to capture the two semantic features and integrates the above functions. According to experiments on NYT data, our NS-Hunter model achieves the best results in distant supervision denoising and sentence-level relation classification.
Keywords: distant supervision, relation classification, semantic denoising
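The cloze setup described in the abstract can be illustrated with a minimal sketch. It only shows how a sentence might be turned into a cloze input by masking the target entity, and how a sentence could be routed as "specific" (target entity is a single vocabulary token) or "general" (multi-token target). The function names, the toy vocabulary, and the lowercase whitespace-based check are assumptions for illustration, not the authors' implementation; a real system would consult the actual BERT WordPiece vocabulary.

```python
MASK = "[MASK]"

def build_cloze(sentence, target):
    """Replace the first occurrence of the target entity with one [MASK] token."""
    if target not in sentence:
        raise ValueError("target entity not found in sentence")
    return sentence.replace(target, MASK, 1)

def route(target, vocab):
    """Return 'specific' if the target is a single vocabulary token, else 'general'.

    Assumption: vocabulary membership is tested on the lowercased surface form.
    """
    return "specific" if target.lower() in vocab else "general"

# Hypothetical toy vocabulary standing in for the BERT vocabulary.
toy_vocab = {"obama", "hawaii", "france", "paris"}

cloze = build_cloze("Obama was born in Hawaii.", "Hawaii")
kind_single = route("Hawaii", toy_vocab)
kind_multi = route("New York City", toy_vocab)
```

In this sketch a "specific" sentence would go on to the dependency-based check, while a "general" sentence would rely on the MASK-lhs feature, following the two-stage order given in the abstract.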
Anthology ID:
2021.ccl-1.99
Volume:
Proceedings of the 20th Chinese National Conference on Computational Linguistics
Month:
August
Year:
2021
Address:
Huhhot, China
Venue:
CCL
Publisher:
Chinese Information Processing Society of China
Pages:
1109–1120
Language:
English
URL:
https://aclanthology.org/2021.ccl-1.99
Cite (ACL):
Shen Tielin, Wang Daling, Feng Shi, and Zhang Yifei. 2021. NS-Hunter: BERT-Cloze Based Semantic Denoising for Distantly Supervised Relation Classification. In Proceedings of the 20th Chinese National Conference on Computational Linguistics, pages 1109–1120, Huhhot, China. Chinese Information Processing Society of China.
Cite (Informal):
NS-Hunter: BERT-Cloze Based Semantic Denoising for Distantly Supervised Relation Classification (Tielin et al., CCL 2021)
PDF:
https://preview.aclanthology.org/update-css-js/2021.ccl-1.99.pdf