Knowledge distillation for knowledge graph embedding (KGE) models effectively compresses them by reducing their embedding dimensions. While existing methods distill knowledge from a high-dimensional teacher to a low-dimensional student, they typically rely on a single teacher embedding space and thus overlook the complementary knowledge offered by teachers in distinct embedding spaces. This paper introduces DTDES-KGE, a novel knowledge distillation framework that significantly improves distillation performance by leveraging dual teachers in distinct embedding spaces. To overcome the spatial heterogeneity that arises when integrating knowledge from dual teachers, we propose a spatial compatibility module that reconciles the two spaces. Additionally, we introduce a student-aware knowledge fusion mechanism that dynamically fuses the knowledge from the dual teachers. Extensive experiments on two real-world datasets validate the effectiveness of DTDES-KGE.
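The abstract names a student-aware knowledge fusion mechanism without spelling out its form. Below is a minimal PyTorch sketch of one plausible instantiation, assuming both teachers expose soft triple-scoring distributions; the agreement-based weighting and all names (fuse_dual_teacher_scores, tau, and the score tensors) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def fuse_dual_teacher_scores(student_scores, teacher_a_scores, teacher_b_scores, tau=2.0):
    # All inputs: (batch, num_candidates) plausibility scores over
    # candidate triples; higher means more plausible.
    log_p_student = F.log_softmax(student_scores / tau, dim=-1)
    p_a = F.softmax(teacher_a_scores / tau, dim=-1)
    p_b = F.softmax(teacher_b_scores / tau, dim=-1)

    # Student-aware weights (assumption): per example, the teacher
    # whose distribution the student diverges from less gets more weight.
    kl_a = F.kl_div(log_p_student, p_a, reduction="none").sum(dim=-1)
    kl_b = F.kl_div(log_p_student, p_b, reduction="none").sum(dim=-1)
    w = F.softmax(torch.stack([-kl_a, -kl_b], dim=-1), dim=-1)  # (batch, 2)

    # Fuse the two teacher distributions and distill from the result.
    fused = w[:, 0:1] * p_a + w[:, 1:2] * p_b
    return F.kl_div(log_p_student, fused, reduction="batchmean") * tau ** 2
```

Any per-example agreement signal could replace the KL term here; the point is only that the fusion weights depend on the student, so neither teacher dominates uniformly across examples.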
Low-resource taxonomy completion aims to automatically insert new concepts into an existing taxonomy when only a few in-domain training samples are available. Recent studies have achieved considerable progress by incorporating prior knowledge from pre-trained language models (PLMs). However, these studies tend to rely too heavily on such knowledge and neglect the knowledge shareable across different taxonomies. In this paper, we propose TaxoPro, a plug-in LoRA-based cross-domain method that captures shareable knowledge from a high-resource taxonomy to improve PLM-based low-resource taxonomy completion techniques. To prevent negative interference between domain-specific and domain-shared knowledge, TaxoPro decomposes cross-domain knowledge into domain-shared and domain-specific components, storing them in low-rank (LoRA) matrices. Additionally, TaxoPro employs two auxiliary losses to regulate the flow of shareable knowledge. Experimental results demonstrate that TaxoPro improves PLM-based techniques, achieving state-of-the-art performance in completing low-resource taxonomies. Code is available at https://github.com/cyclexu/TaxoPro.
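To make the decomposition concrete, here is a sketch of a frozen PLM linear layer augmented with one domain-shared and one per-domain low-rank update, in the spirit of the described LoRA split; the class and argument names (DecomposedLoRALinear, rank, num_domains, alpha) are assumptions, and the two auxiliary losses are not sketched.

```python
import torch
import torch.nn as nn

class DecomposedLoRALinear(nn.Module):
    # Illustrative sketch: y = Wx + scale * (B_shared A_shared x
    # + B_domain A_domain x), with W frozen and only the low-rank
    # factors trained.
    def __init__(self, in_dim, out_dim, rank=8, num_domains=2, alpha=16.0):
        super().__init__()
        self.base = nn.Linear(in_dim, out_dim)
        for p in self.base.parameters():       # frozen pre-trained weight
            p.requires_grad_(False)
        self.scale = alpha / rank
        self.shared_a = nn.Parameter(torch.randn(rank, in_dim) * 0.01)
        self.shared_b = nn.Parameter(torch.zeros(out_dim, rank))
        self.spec_a = nn.Parameter(torch.randn(num_domains, rank, in_dim) * 0.01)
        self.spec_b = nn.Parameter(torch.zeros(num_domains, out_dim, rank))

    def forward(self, x, domain):
        shared = (x @ self.shared_a.T) @ self.shared_b.T
        specific = (x @ self.spec_a[domain].T) @ self.spec_b[domain].T
        return self.base(x) + self.scale * (shared + specific)
```

Zero-initializing the B factors keeps the layer equal to the frozen base at the start of training, the standard LoRA trick, while the separate shared/specific factors give the auxiliary losses distinct parameter groups to regulate.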
Automatic taxonomy completion aims to attach an emerging concept to an appropriate pair of hypernym and hyponym in the existing taxonomy. Existing methods suffer from overfitting to leaf-only attachment, caused by the imbalance between leaf and non-leaf samples when training a newly initialized classification head. Besides, they use the subtasks, namely attaching the concept to its hypernym or hyponym, only as auxiliary supervision for representation learning and neglect the effect of subtask results on the final prediction. To address these limitations, we propose TacoPrompt, a Collaborative Multi-Task Prompt Learning Method for Self-Supervised Taxonomy Completion. First, we perform triplet semantic matching using the prompt learning paradigm to effectively learn non-leaf attachment ability from imbalanced training samples. Second, we design the result context to relate the final prediction to the subtask results by a contextual approach, enhancing prompt-based multi-task learning. Third, we leverage a two-stage retrieval and re-ranking approach to improve inference efficiency. Experimental results on three datasets show that TacoPrompt achieves state-of-the-art taxonomy completion performance. Codes are available at https://github.com/cyclexu/TacoPrompt.
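The two-stage retrieval and re-ranking step admits a simple generic sketch: a cheap dense retriever shortlists candidate <hypernym, hyponym> positions, and only the shortlist is scored by the expensive matcher (in TacoPrompt, the prompt-based triplet scorer). The function and argument names below (retrieve_then_rerank, rerank_fn, top_k) are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def retrieve_then_rerank(query_emb, position_embs, rerank_fn, top_k=50):
    # Stage 1: cosine-similarity retrieval over all candidate
    # positions; query_emb is (d,), position_embs is (N, d).
    sims = F.cosine_similarity(query_emb.unsqueeze(0), position_embs, dim=-1)
    k = min(top_k, position_embs.size(0))
    _, shortlist = sims.topk(k)

    # Stage 2: the expensive matcher re-ranks only the k shortlisted
    # positions instead of all N, cutting inference cost.
    fine_scores = rerank_fn(shortlist)            # shape: (k,)
    order = fine_scores.argsort(descending=True)
    return shortlist[order], fine_scores[order]
```

The design trades a small recall risk in stage 1 for a large speedup in stage 2, since the heavy scorer runs on k candidates rather than the whole taxonomy.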
As an essential form of knowledge representation, taxonomies are widely used in various downstream natural language processing tasks. However, as new concepts continually emerge, many existing taxonomies cannot maintain their coverage through manual expansion. In this paper, we propose TEMP, a self-supervised taxonomy expansion method that predicts the position of new concepts by ranking the generated taxonomy-paths. For the first time, TEMP employs pre-trained contextual encoders for the taxonomy construction and hypernym detection problems. Experiments show that pre-trained contextual embeddings are able to capture hypernym-hyponym relations. To learn finer-grained differences between taxonomy-paths, we train the model with a dynamic margin loss defined by a novel dynamic margin function. Extensive evaluations show that TEMP outperforms prior state-of-the-art taxonomy expansion approaches by 14.3% in accuracy and 15.8% in mean reciprocal rank on three public benchmarks.
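A dynamic margin loss of the kind the abstract describes can be sketched as a hinge loss whose margin grows with how far a negative path is from the true one. The concrete margin function below (a normalized symmetric difference of path node sets) and all names (dynamic_margin_loss, gamma) are illustrative assumptions, not TEMP's exact formulation.

```python
import torch

def dynamic_margin_loss(pos_scores, neg_scores, pos_paths, neg_paths, gamma=1.0):
    # pos_scores / neg_scores: (batch,) path-ranking scores;
    # pos_paths / neg_paths: taxonomy-paths as sequences of node ids.
    margins = []
    for p, n in zip(pos_paths, neg_paths):
        p_set, n_set = set(p), set(n)
        # The less a negative path overlaps the true one, the larger
        # the required margin (assumed distance function).
        margins.append(len(p_set ^ n_set) / (len(p_set) + len(n_set)))
    margin = gamma * torch.tensor(margins, dtype=pos_scores.dtype)
    # Hinge: zero loss once the true path outscores the negative by
    # at least the pair-specific margin.
    return torch.relu(margin - (pos_scores - neg_scores)).mean()
```

Compared with a fixed margin, this penalizes "near-miss" negatives gently while forcing clearly wrong paths far below the true path's score.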