Xiangyu Lin


2023

pdf
Self-distilled Transitive Instance Weighting for Denoised Distantly Supervised Relation Extraction
Xiangyu Lin | Weijia Jia | Zhiguo Gong
Findings of the Association for Computational Linguistics: EMNLP 2023

The widespread existence of wrongly labeled instances is a challenge to distantly supervised relation extraction. Most of the previous works are trained in a bag-level setting to alleviate such noise. However, sentence-level training better utilizes the information than bag-level training, as long as combined with effective noise alleviation. In this work, we propose a novel Transitive Instance Weighting mechanism integrated with the self-distilled BERT backbone, utilizing information in the intermediate outputs to generate dynamic instance weights for denoised sentence-level training. By down-weighting wrongly labeled instances and discounting the weights of easy-to-fit ones, our method can effectively tackle wrongly labeled instances and prevent overfitting. Experiments on both held-out and manual datasets indicate that our method achieves state-of-the-art performance and consistent improvements over the baselines.

2021

pdf
Distantly Supervised Relation Extraction using Multi-Layer Revision Network and Confidence-based Multi-Instance Learning
Xiangyu Lin | Tianyi Liu | Weijia Jia | Zhiguo Gong
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Distantly supervised relation extraction is widely used in the construction of knowledge bases due to its high efficiency. However, the automatically obtained instances are of low quality with numerous irrelevant words. In addition, the strong assumption of distant supervision leads to the existence of noisy sentences in the sentence bags. In this paper, we propose a novel Multi-Layer Revision Network (MLRN) which alleviates the effects of word-level noise by emphasizing inner-sentence correlations before extracting relevant information within sentences. Then, we devise a balanced and noise-resistant Confidence-based Multi-Instance Learning (CMIL) method to filter out noisy sentences as well as assign proper weights to relevant ones. Extensive experiments on two New York Times (NYT) datasets demonstrate that our approach achieves significant improvements over the baselines.

2020

pdf
Regularized Attentive Capsule Network for Overlapped Relation Extraction
Tianyi Liu | Xiangyu Lin | Weijia Jia | Mingliang Zhou | Wei Zhao
Proceedings of the 28th International Conference on Computational Linguistics

Distantly supervised relation extraction has been widely applied in knowledge base construction due to its less requirement of human efforts. However, the automatically established training datasets in distant supervision contain low-quality instances with noisy words and overlapped relations, introducing great challenges to the accurate extraction of relations. To address this problem, we propose a novel Regularized Attentive Capsule Network (RA-CapNet) to better identify highly overlapped relations in each informal sentence. To discover multiple relation features in an instance, we embed multi-head attention into the capsule network as the low-level capsules, where the subtraction of two entities acts as a new form of relation query to select salient features regardless of their positions. To further discriminate overlapped relation features, we devise disagreement regularization to explicitly encourage the diversity among both multiple attention heads and low-level capsules. Extensive experiments conducted on widely used datasets show that our model achieves significant improvements in relation extraction.