Exploiting Noisy Data in Distant Supervision Relation Classification
Kaijia Yang, Liang He, Xin-yu Dai, Shujian Huang, Jiajun Chen
Abstract
Distant supervision has made great progress on the relation classification task. However, it still suffers from the noisy labeling problem. Unlike previous work, which underutilizes noisy data that inherently characterize the properties of classification, we propose RCEND, a novel framework to enhance Relation Classification by Exploiting Noisy Data. First, we design an instance discriminator with reinforcement learning to split the noisy data into correctly labeled and incorrectly labeled data. Second, we learn a robust relation classifier in a semi-supervised manner, treating the correctly and incorrectly labeled data as labeled and unlabeled data, respectively. Experimental results show that our method outperforms state-of-the-art models.
- Anthology ID:
- N19-1325
- Volume:
- Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
- Month:
- June
- Year:
- 2019
- Address:
- Minneapolis, Minnesota
- Venue:
- NAACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 3216–3225
- URL:
- https://aclanthology.org/N19-1325
- DOI:
- 10.18653/v1/N19-1325
- Cite (ACL):
- Kaijia Yang, Liang He, Xin-yu Dai, Shujian Huang, and Jiajun Chen. 2019. Exploiting Noisy Data in Distant Supervision Relation Classification. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 3216–3225, Minneapolis, Minnesota. Association for Computational Linguistics.
- Cite (Informal):
- Exploiting Noisy Data in Distant Supervision Relation Classification (Yang et al., NAACL 2019)
- PDF:
- https://preview.aclanthology.org/remove-xml-comments/N19-1325.pdf