2025
Exploiting Contextual Knowledge in LLMs through 𝒱-usable Information based Layer Enhancement
Xiaowei Yuan | Zhao Yang | Ziyang Huang | Yequan Wang | Siqi Fan | Yiming Ju | Jun Zhao | Kang Liu
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large Language Models (LLMs) have demonstrated remarkable capabilities in various tasks, yet they often struggle to generate context-faithful outputs that properly reflect contextual knowledge. Existing approaches focus on enhancing decoding strategies but ignore the fundamental mechanism by which contextual information is processed within LLMs' internal states. As a result, LLMs remain limited in their ability to fully leverage contextual knowledge. In this paper, we propose Context-aware Layer Enhancement (CaLE), a novel intervention method that enhances the utilization of contextual knowledge within LLMs' internal representations. By employing 𝒱-usable information analysis, CaLE strategically amplifies the growth of contextual information at an optimal layer, thereby enriching the representations in the final layer. Our experiments demonstrate that CaLE effectively improves context-faithful generation in Question-Answering tasks, particularly in scenarios involving unknown or conflicting contextual knowledge.
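For context, 𝒱-usable information quantifies how much information about a target Y a restricted model family 𝒱 can actually extract from an input X; the layer-wise analysis described above presumably tracks a quantity of this general form across layers. A sketch of the standard definitions (not necessarily the paper's exact formulation):

```latex
% Standard predictive V-entropy and V-usable information (sketch only;
% CaLE's layer-wise variant may differ in detail).
\begin{align}
  H_{\mathcal{V}}(Y)        &= \inf_{f \in \mathcal{V}} \mathbb{E}\big[-\log f[\varnothing](Y)\big] \\
  H_{\mathcal{V}}(Y \mid X) &= \inf_{f \in \mathcal{V}} \mathbb{E}\big[-\log f[X](Y)\big] \\
  I_{\mathcal{V}}(X \to Y)  &= H_{\mathcal{V}}(Y) - H_{\mathcal{V}}(Y \mid X)
\end{align}
```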
Towards Adaptive Mechanism Activation in Language Agent
Ziyang Huang | Jun Zhao | Kang Liu
Proceedings of the 31st International Conference on Computational Linguistics
Language agents can be endowed with different mechanisms for autonomous task accomplishment. Current agents typically rely on a fixed mechanism or a set of mechanisms activated in a predefined order, which limits their adaptation to the varied solution structures of potential tasks. To this end, this paper proposes Adaptive Language Agent Mechanism Activation Learning with Self-Exploration (ALAMA), which focuses on optimizing mechanism-activation adaptability without relying on expert models. It first builds a harmonized agent framework (UniAct) to Unify different mechanisms via Actions. It then leverages a training-efficient optimization method based on self-exploration to enable UniAct to adaptively activate the appropriate mechanisms according to the potential characteristics of the task. Experimental results demonstrate significant improvements on downstream agent tasks, affirming the effectiveness of our approach in facilitating more dynamic and context-sensitive mechanism activation.
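As a rough illustration of the "unify mechanisms via actions" idea, the sketch below treats candidate mechanisms (e.g., direct answering, planning, reflection) as actions a per-task policy selects among; the mechanism names, policy interface, and hand-written selection rules are all hypothetical and not taken from the paper (where the policy would instead be learned via self-exploration).

```python
# Illustrative sketch only: mechanism names and the selection logic are
# hypothetical, not ALAMA's actual implementation.
from typing import Callable, Dict

# Each "mechanism" is a callable that attempts to solve the task.
MECHANISMS: Dict[str, Callable[[str], str]] = {
    "direct":  lambda task: f"[direct answer to] {task}",
    "plan":    lambda task: f"[plan-then-solve for] {task}",
    "reflect": lambda task: f"[answer with self-reflection for] {task}",
}

def activation_policy(task: str) -> str:
    """Hypothetical policy: pick a mechanism from surface task features.

    In ALAMA this choice would be learned from self-exploration rather
    than hand-written rules like these.
    """
    if len(task.split()) > 30:      # long, multi-step task -> plan first
        return "plan"
    if "verify" in task.lower():    # verification-style task -> reflect
        return "reflect"
    return "direct"

def run_agent(task: str) -> str:
    mechanism = MECHANISMS[activation_policy(task)]
    return mechanism(task)

if __name__ == "__main__":
    print(run_agent("What is the capital of France?"))
```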
KMatrix-2: A Comprehensive Heterogeneous Knowledge Collaborative Enhancement Toolkit for Large Language Model
Shun Wu | Di Wu | Wangtao Sun | Ziyang Huang | Xiaowei Yuan | Kun Luo | XueYou Zhang | Shizhu He | Jun Zhao | Kang Liu
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
This paper presents KMatrix-2, an open-source toolkit that supports comprehensive heterogeneous knowledge collaborative enhancement for Large Language Models (LLMs). As the successor to KMatrix, our toolkit offers powerful modular components and typical enhancement patterns for the convenient construction of mainstream knowledge-enhanced LLM systems. In addition, it provides unified knowledge integration and joint knowledge retrieval methods to achieve more comprehensive heterogeneous knowledge collaborative enhancement. Compared with KMatrix, which mainly focuses on descriptive knowledge, this work additionally considers procedural knowledge. Moreover, systematic inter-context and context-memory knowledge conflict resolution methods are offered for better knowledge integration. We analyze several key research questions in heterogeneous knowledge-enhanced LLM systems and validate our toolkit's capability in building such systems.
Improve Rule Retrieval and Reasoning with Self-Induction and Relevance ReEstimate
Ziyang Huang | Wangtao Sun | Jun Zhao | Kang Liu
Findings of the Association for Computational Linguistics: ACL 2025
This paper systematically addresses the challenge of rule retrieval, a crucial yet underexplored problem. Vanilla retrieval methods, which use sparse or dense retrievers to directly search for relevant rules to support downstream reasoning, often suffer from low accuracy. This is primarily due to a significant semantic gap between the instantiated facts in the queries and the abstract representations of the rules. Such misalignment results in suboptimal retrieval quality, which in turn degrades reasoning performance. To overcome these challenges, we propose Self-Induction Augmented Retrieval (SIAR), a novel approach that uses Large Language Models (LLMs) to induce potential inferential rules that may benefit reasoning by abstracting the underlying knowledge and logical structure of queries. These induced rules are then used to augment the query and improve retrieval effectiveness. Additionally, we introduce Rule Relevance ReEstimate (R3), a method that re-estimates the relevance of retrieved rules by assessing whether the abstract knowledge they contain can be instantiated to align with the facts in the queries and whether they are helpful for reasoning. Extensive experiments across various settings demonstrate the effectiveness and versatility of our proposed methods.
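A minimal sketch of the retrieve-with-induced-rule-then-re-estimate flow described above; the prompts, the `llm` and `retriever` interfaces, and the scoring are assumptions for illustration, not the paper's implementation.

```python
# Illustrative sketch: `llm` and `retriever` are assumed interfaces
# (any chat-completion function and any sparse/dense retriever);
# prompts and scoring are placeholders, not the paper's exact setup.
from typing import Callable, List, Tuple

def siar_retrieve(query: str,
                  llm: Callable[[str], str],
                  retriever: Callable[[str, int], List[str]],
                  k: int = 10) -> List[str]:
    """Self-Induction Augmented Retrieval: induce an abstract rule from
    the query, then retrieve with the rule-augmented query."""
    induced_rule = llm(
        "Abstract the facts in the following question into a general "
        f"inference rule:\n{query}"
    )
    augmented_query = f"{query}\n{induced_rule}"
    return retriever(augmented_query, k)

def rule_relevance_reestimate(query: str,
                              rules: List[str],
                              llm: Callable[[str], str]) -> List[Tuple[str, float]]:
    """R3-style re-estimation: ask the LLM whether each retrieved rule can
    be instantiated with the query's facts and would help reasoning."""
    scored = []
    for rule in rules:
        verdict = llm(
            "Can this rule be instantiated with the facts in the question "
            f"and help answer it? Reply yes or no.\nRule: {rule}\nQuestion: {query}"
        )
        score = 1.0 if verdict.strip().lower().startswith("yes") else 0.0
        scored.append((rule, score))
    return sorted(scored, key=lambda x: x[1], reverse=True)
```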
2023
DiffusionSL: Sequence Labeling via Tag Diffusion Process
Ziyang Huang | Pengfei Cao | Jun Zhao | Kang Liu
Findings of the Association for Computational Linguistics: EMNLP 2023
Sequence Labeling (SL) is a long-standing task in Natural Language Processing (NLP). Traditionally, discriminative rather than generative models have been widely used to capture the conditional distribution of sequence tags. In this paper, we present DiffusionSL, a framework that uses a conditional discrete diffusion model to generate discrete tag data, resulting in a Tag Diffusion Process. We treat the natural language sequence as the conditional signal and the sequence tags as the generation target, iteratively refining the noisy tags to obtain clean ones. To address the discreteness issue, we propose the Bit-Tag Converter (BTConverter) to model the target in continuous data space. Furthermore, we introduce the Bit Diffusion Transformer (BitDiT) to model the noise-elimination process. Leveraging the powerful iterative refinement capability of the diffusion model, DiffusionSL achieves superior performance over previous state-of-the-art (SOTA) baselines and significantly outperforms gpt-3.5-turbo across multiple benchmark datasets and tasks.
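A toy sketch of the bit-conversion idea behind a Bit-Tag Converter: represent each discrete tag id as a fixed-length vector of ±1 "analog bits" so a diffusion model can operate in continuous space, then threshold to recover the tag. The bit width and encoding below are assumptions for illustration, not the paper's exact design.

```python
# Toy sketch of tag <-> analog-bit conversion (assumed design, not the
# paper's exact BTConverter): each tag id becomes a vector of +/-1 bits.
import numpy as np

def tags_to_bits(tag_ids: np.ndarray, num_bits: int) -> np.ndarray:
    """Encode integer tag ids (shape [seq_len]) as {-1, +1} bit vectors
    of shape [seq_len, num_bits], most significant bit first."""
    powers = 2 ** np.arange(num_bits - 1, -1, -1)
    bits = (tag_ids[:, None] // powers) % 2          # binary digits in {0, 1}
    return bits.astype(np.float32) * 2.0 - 1.0       # map to {-1, +1}

def bits_to_tags(analog_bits: np.ndarray) -> np.ndarray:
    """Decode (possibly noisy) continuous bits back to tag ids by
    thresholding each dimension at zero."""
    num_bits = analog_bits.shape[-1]
    hard_bits = (analog_bits > 0).astype(np.int64)   # back to {0, 1}
    powers = 2 ** np.arange(num_bits - 1, -1, -1)
    return (hard_bits * powers).sum(axis=-1)

if __name__ == "__main__":
    tags = np.array([0, 3, 5, 2])                        # e.g., BIO-style tag ids
    bits = tags_to_bits(tags, num_bits=3)
    noisy = bits + 0.1 * np.random.randn(*bits.shape)    # simulate a denoised output
    assert (bits_to_tags(noisy) == tags).all()
```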