Yang Shen


2025

Whether LLMs Know If They Know: Identifying Knowledge Boundaries via Debiased Historical In-Context Learning
Bo Lv | Nayu Liu | Yang Shen | Xin Liu | Ping Luo | Yue Yu
Findings of the Association for Computational Linguistics: ACL 2025

In active retrieval (AR), large language models (LLMs) must first assess whether they possess the knowledge to answer a given query in order to decide whether to invoke a retrieval module. Existing methods primarily rely on training classification models or on using the confidence of the model’s answer to determine knowledge boundaries. However, training-based methods may have limited generalization, and our analysis reveals that LLMs struggle to reliably assess whether they possess the required information based on their answers, as they are often biased by prior cognitive tendencies (e.g., tokens’ semantic preferences). To address this, we propose Debiased Historical In-Context Learning (DH-ICL) to identify knowledge boundaries in AR. DH-ICL reframes this metacognitive self-awareness task as a structured pattern-learning problem: it retrieves similar historical queries as high-confidence in-context examples that guide LLMs in identifying knowledge boundaries. Furthermore, we introduce a historical bias calibration strategy that leverages deviations in the model’s past response logits to mitigate cognitive biases in its current knowledge boundary assessment. Experiments on four QA benchmarks show that DH-ICL achieves performance comparable to full retrieval on LLaMA with only half the number of retrievals and without any additional training.
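
The two ideas in the abstract, retrieving similar historical queries and calibrating against past logits, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the cosine-similarity retrieval, the mean-margin bias estimate, the decision threshold of zero, and every name below (`retrieve_similar`, `should_retrieve`, the `vec` and `margin` fields) are assumptions made purely for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_similar(query_vec, history, k=2):
    """Return the k historical records most similar to the query.

    Each record is a hypothetical dict with:
      'vec'    - an embedding of the past query
      'margin' - the model's past logit margin ("known" minus "unknown")
    """
    ranked = sorted(history, key=lambda h: cosine(query_vec, h["vec"]), reverse=True)
    return ranked[:k]

def should_retrieve(current_margin, neighbors):
    """Decide whether to call the retriever.

    Assumed calibration: subtract the mean past margin of similar queries
    from the current margin, then retrieve if the result is negative.
    """
    bias = sum(h["margin"] for h in neighbors) / len(neighbors)
    calibrated = current_margin - bias
    return calibrated < 0

# Toy history: two queries close to the new one, one unrelated.
history = [
    {"vec": [1.0, 0.0], "margin": 2.0},
    {"vec": [0.9, 0.1], "margin": 1.0},
    {"vec": [0.0, 1.0], "margin": -1.0},
]
neighbors = retrieve_similar([1.0, 0.0], history, k=2)
decision_low = should_retrieve(1.0, neighbors)   # margin below the historical mean
decision_high = should_retrieve(2.0, neighbors)  # margin above the historical mean
```

In this toy setup a raw margin of 1.0 looks confident on its own, but it falls below the average margin the model showed on similar past queries, so the sketch treats it as a case for retrieval. In the paper's setting, the retrieved neighbors would also be placed in the prompt as in-context examples, which this sketch omits.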

2013

Chinese Short Text Classification Based on Domain Knowledge
Xiao Feng | Yang Shen | Chengyong Liu | Wei Liang | Shuwu Zhang
Proceedings of the Sixth International Joint Conference on Natural Language Processing