2025
Commonsense Subgraph for Inductive Relation Reasoning with Meta-learning
Feng Zhao | Zhilu Zhang | Cheng Yan | Xianggan Liu
Proceedings of the 31st International Conference on Computational Linguistics
In knowledge graphs (KGs), predicting missing relations is a critical reasoning task. Recent subgraph-based models have delved into inductive settings, which aim to predict relations between newly added entities. While these models have demonstrated the ability to perform inductive reasoning, they consider only the structural information of the subgraph and neglect the semantic information that is lost when entities are replaced with nodes. To address this problem, we propose a novel Commonsense Subgraph Meta-Learning (CSML) model. Specifically, we extract concepts from entities, which can be viewed as high-level semantic information. Unlike previous methods, we use concepts instead of nodes to construct commonsense subgraphs. By combining these with structural subgraphs, we can leverage both structural and semantic information for more comprehensive and rational predictions. Furthermore, we regard concepts as meta-information and employ meta-learning to facilitate rapid knowledge transfer, thus addressing more complex few-shot scenarios. Experimental results confirm the superior performance of our model in both standard and few-shot inductive reasoning.
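As a rough illustration of the fusion idea in this abstract, the following PyTorch sketch scores relations from a structural-subgraph embedding combined with a concept-level (commonsense) subgraph embedding; the layer names, dimensions, and fusion scheme are assumptions for illustration, not the paper's released code.

```python
import torch
import torch.nn as nn

class SubgraphRelationScorer(nn.Module):
    """Illustrative scorer that fuses a structural-subgraph embedding with a
    commonsense (concept-level) subgraph embedding before predicting a relation.
    The fusion scheme and dimensions are assumptions, not CSML's exact design."""

    def __init__(self, dim: int, num_relations: int):
        super().__init__()
        self.fuse = nn.Linear(2 * dim, dim)          # combine the two views
        self.classifier = nn.Linear(dim, num_relations)

    def forward(self, structural_emb: torch.Tensor, concept_emb: torch.Tensor) -> torch.Tensor:
        # structural_emb: pooled encoding of the enclosing structural subgraph
        # concept_emb:    pooled encoding of the concept-level (commonsense) subgraph
        fused = torch.relu(self.fuse(torch.cat([structural_emb, concept_emb], dim=-1)))
        return self.classifier(fused)                 # scores over candidate relations
```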
Priority on High-Quality: Selecting Instruction Data via Consistency Verification of Noise Injection
Hong Zhang | Feng Zhao | Ruilin Zhao | Cheng Yan | Kangzheng Liu
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Large Language Models (LLMs) have demonstrated a remarkable understanding of language nuances through instruction tuning, enabling them to effectively tackle various natural language processing tasks. Recent research has focused on the quality of instruction data rather than the quantity of instructions. However, existing high-quality instruction selection methods rely on external models or rules and overlook the intrinsic association between the pre-trained model and the instruction data, making it difficult to select data that align with the pre-trained model's preferences. To address this challenge, we propose a strategy that uses noise injection to assess the quality of instruction data without relying on an external model. We also combine inter-class and intra-class diversity to further improve model performance. Experimental results demonstrate that our method significantly outperforms the model trained on the entire dataset as well as established baselines. Our study provides a new perspective on noise injection in the field of instruction tuning and illustrates that the pre-trained model itself should be taken into account when defining high-quality data. Additionally, we publish our selected high-quality instruction data.
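A minimal sketch of how a noise-injection consistency check could be implemented with a Hugging Face-style causal language model; the Gaussian perturbation, the scoring rule, and the helper name are assumptions, not the paper's exact selection criterion.

```python
import torch

@torch.no_grad()
def consistency_score(model, tokenizer, instruction: str, response: str,
                      noise_std: float = 0.01) -> float:
    """Toy consistency check: compare the response loss with and without Gaussian
    noise added to the input embeddings. A small gap suggests the example sits in a
    stable region of the pre-trained model. Hypothetical helper for illustration."""
    ids = tokenizer(instruction + response, return_tensors="pt").input_ids
    embeds = model.get_input_embeddings()(ids)

    clean = model(inputs_embeds=embeds, labels=ids).loss.item()
    noisy = model(inputs_embeds=embeds + noise_std * torch.randn_like(embeds),
                  labels=ids).loss.item()
    return -abs(noisy - clean)   # higher score = more consistent under noise
```

Examples could then be ranked by this score and the top fraction kept, with a diversity pass applied afterwards.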
2024
Correcting Language Model Bias for Text Classification in True Zero-Shot Learning
Feng Zhao | Wan Xianlin | Cheng Yan | Chu Kiong Loo
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Combining pre-trained language models (PLMs) and manual templates is a common practice for text classification in zero-shot scenarios. However, the effect of this approach is highly volatile, ranging from random guesses to near state-of-the-art results, depending on the quality of the manual templates. In this paper, we show that this instability stems from the fact that language models tend to favor certain label words in text classification, and manual templates can influence this tendency. To address this, we develop a novel pipeline for annotating and filtering a few examples drawn from unlabeled data. Moreover, we propose a new method to measure model bias on label words that uses unlabeled examples as a validation set when tuning language models. Our approach does not require any pre-labeled examples. Experimental results on six text classification tasks demonstrate that the proposed approach significantly outperforms standard prompt learning in zero-shot settings, achieving up to 19.7% absolute improvement and 13.8% average improvement. More surprisingly, on IMDB and SST-2, our approach even exceeds all few-shot baselines.
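A simplified sketch of the bias-correction intuition: estimate the model's average preference for each label word from unlabeled examples and subtract it from the per-example scores. This is an illustrative calibration step under that assumption, not the paper's exact procedure.

```python
import torch

@torch.no_grad()
def calibrated_label_scores(example_logits: torch.Tensor,
                            unlabeled_logits: torch.Tensor) -> torch.Tensor:
    """example_logits:   [batch, num_labels] label-word scores for test examples
    unlabeled_logits: [num_unlabeled, num_labels] label-word scores on unlabeled data
    Returns debiased per-example scores (hypothetical calibration for illustration)."""
    prior = unlabeled_logits.log_softmax(-1).mean(dim=0)   # estimated label-word bias
    return example_logits.log_softmax(-1) - prior          # subtract the bias
```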
2023
Structure-aware Knowledge Graph-to-text Generation with Planning Selection and Similarity Distinction
Feng Zhao | Hongzhi Zou | Cheng Yan
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
The knowledge graph-to-text (KG-to-text) generation task aims to synthesize coherent and engaging sentences that accurately convey the complex information derived from an input knowledge graph. One of the primary challenges in this task is bridging the gap between the diverse structures of the KG and the target text, while preserving the details of the input KG. To address this, we propose a novel approach that efficiently integrates graph structure-aware modules with pre-trained language models. Unlike conventional techniques, which only consider direct connections between first-order neighbors, our method delves deeper by incorporating Relative Distance Encoding as a bias within the graph structure-aware module. This enables our model to better capture the intricate topology information present in the KG. To further elevate the fidelity of the generated text, Planning Selection and Similarity Distinction are introduced. Our approach filters the most relevant linearized sequences by employing a planning scorer, while simultaneously distinguishing similar input KGs through contrastive learning techniques. Experiments on two datasets demonstrate the superiority of our model.
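A minimal PyTorch sketch of self-attention with an additive bias indexed by relative graph distance, in the spirit of the Relative Distance Encoding mentioned above; the single-head form, distance bucketing, and shapes are simplifying assumptions rather than the paper's architecture.

```python
import torch
import torch.nn as nn

class DistanceBiasedAttention(nn.Module):
    """Self-attention whose scores receive an additive bias looked up from the
    relative graph distance between nodes. Illustrative only."""

    def __init__(self, dim: int, max_dist: int = 8):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.dist_bias = nn.Embedding(max_dist + 1, 1)   # one learnable bias per distance bucket
        self.max_dist = max_dist
        self.scale = dim ** -0.5

    def forward(self, x: torch.Tensor, dist: torch.Tensor) -> torch.Tensor:
        # x:    [nodes, dim] node representations
        # dist: [nodes, nodes] integer shortest-path distances in the input KG
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        scores = (q @ k.t()) * self.scale
        scores = scores + self.dist_bias(dist.clamp(max=self.max_dist)).squeeze(-1)
        return scores.softmax(dim=-1) @ v
```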
2021
Biomedical Concept Normalization by Leveraging Hypernyms
Cheng Yan | Yuanzhe Zhang | Kang Liu | Jun Zhao | Yafei Shi | Shengping Liu
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Biomedical Concept Normalization (BCN) is widely used as a fundamental module in biomedical text processing. Owing to the numerous surface variants of biomedical concepts, BCN remains challenging and unsolved. In this paper, we exploit biomedical concept hypernyms to facilitate BCN. We propose Biomedical Concept Normalizer with Hypernyms (BCNH), a novel framework that adopts list-wise training to make use of both hypernyms and synonyms, and also imposes a norm constraint on the representations of hypernym-hyponym entity pairs. Experimental results show that BCNH outperforms the previous state-of-the-art model on the NCBI dataset.
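To make the list-wise objective and the norm constraint concrete, here is an illustrative loss in PyTorch; the graded relevance targets, margin, and tensor shapes are assumptions rather than BCNH's exact formulation.

```python
import torch
import torch.nn.functional as F

def listwise_loss_with_norm(mention: torch.Tensor, candidates: torch.Tensor,
                            relevance: torch.Tensor, hyper: torch.Tensor,
                            hypo: torch.Tensor, margin: float = 0.1) -> torch.Tensor:
    """mention:    [dim] embedding of the mention
    candidates: [num_candidates, dim] embeddings of candidate concepts (synonyms, hypernyms)
    relevance:  [num_candidates] graded relevance (e.g. synonym > hypernym > negative)
    hyper/hypo: [num_pairs, dim] embeddings of hypernym-hyponym entity pairs
    Illustrative combination of a list-wise ranking loss and a norm constraint."""
    scores = candidates @ mention                              # similarity to each candidate
    rank_loss = F.kl_div(scores.log_softmax(-1),
                         relevance.softmax(-1), reduction="sum")
    # push a hypernym embedding to have a larger norm than its hyponym
    norm_loss = F.relu(hypo.norm(dim=-1) - hyper.norm(dim=-1) + margin).mean()
    return rank_loss + norm_loss
```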