Abstract
Distant supervision can generate large-scale relation classification data quickly and economically. However, it introduces a great number of noisy sentences that do not express their labeled relations. Leveraging the power of the pre-trained language model BERT, in this paper we propose a BERT-based semantic denoising approach for distantly supervised relation classification. In detail, we define an entity pair as a source entity and a target entity. For specific sentences whose target entity is in the BERT vocabulary (a one-token word), we present the differences in dependency between the two entities for noise and non-noise sentences. For general sentences whose target entity is a multi-token word, we further present the differences in the last hidden states of the [MASK]-entity (MASK-lhs for short) in BERT for noise and non-noise sentences. We regard the dependency and MASK-lhs in BERT as two semantic features of sentences. With BERT, we first capture the dependency feature to discriminate the specific sentences, then capture the MASK-lhs feature to denoise distant supervision datasets. We propose NS-Hunter, a novel denoising model that leverages the BERT-cloze ability to capture the two semantic features and integrates the above functions. According to experiments on NYT data, our NS-Hunter model achieves the best results in distant supervision denoising and sentence-level relation classification.

Keywords: distant supervision, relation classification, semantic denoising

- Anthology ID:
- 2021.ccl-1.99
- Volume:
- Proceedings of the 20th Chinese National Conference on Computational Linguistics
- Month:
- August
- Year:
- 2021
- Address:
- Huhhot, China
- Venue:
- CCL
- Publisher:
- Chinese Information Processing Society of China
- Pages:
- 1109–1120
- Language:
- English
- URL:
- https://aclanthology.org/2021.ccl-1.99
- Cite (ACL):
- Shen Tielin, Wang Daling, Feng Shi, and Zhang Yifei. 2021. NS-Hunter: BERT-Cloze Based Semantic Denoising for Distantly Supervised Relation Classification. In Proceedings of the 20th Chinese National Conference on Computational Linguistics, pages 1109–1120, Huhhot, China. Chinese Information Processing Society of China.
- Cite (Informal):
- NS-Hunter: BERT-Cloze Based Semantic Denoising for Distantly Supervised Relation Classification (Tielin et al., CCL 2021)
- PDF:
- https://preview.aclanthology.org/paclic-22-ingestion/2021.ccl-1.99.pdf