Jiakai Wang
2025
Lexical Diversity-aware Relevance Assessment for Retrieval-Augmented Generation
Zhange Zhang
|
Yuqing Ma
|
Yulong Wang
|
Shan He
|
Tianbo Wang
|
Siqi He
|
Jiakai Wang
|
Xianglong Liu
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Retrieval-Augmented Generation (RAG) has proven effective in enhancing the factuality of LLMs’ generation, making them a focal point of research. However, previous RAG approaches overlook the lexical diversity of queries, hindering their ability to achieve a granular relevance assessment between queries and retrieved documents, resulting in suboptimal performance. In this paper, we introduce a Lexical Diversity-aware RAG (DRAG) method to address the biases in relevant information retrieval and utilization induced by lexical diversity. Specifically, a Diversity-sensitive Relevance Analyzer is proposed to decouple and assess the relevance of different query components (words, phrases) based on their levels of lexical diversity, ensuring precise and comprehensive document retrieval. Moreover, a Risk-guided Sparse Calibration strategy is further introduced to calibrate the generated tokens that is heavily affected by irrelevant content. Through these modules, DRAG is capable of effectively retrieving relevant documents and leverages their pertinent knowledge to refine the original results and generate meaningful outcomes. Extensive experiments on widely used benchmarks demonstrate the efficacy of our approach, yielding a 10.6% accuracy improvement on HotpotQA.
Token-Aware Editing of Internal Activations for Large Language Model Alignment
Tianbo Wang
|
Yuqing Ma
|
Kewei Liao
|
Chengzhao Yang
|
Zhange Zhang
|
Jiakai Wang
|
Xianglong Liu
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Intervening the internal activations of large language models (LLMs) provides an effective inference-time alignment approach to mitigate undesirable behaviors, such as generating erroneous or harmful content, thereby ensuring safe and reliable applications of LLMs. However, previous methods neglect the misalignment discrepancy among varied tokens, resulting in deviant alignment direction and inflexible editing strength. To address these issues, we propose a token-aware editing (TAE) approach to fully utilize token-level alignment information in the activation space, therefore realizing superior post-intervention performance. Specifically, a Mutual Information-guided Graph Aggregation (MIG) module first develops an MI-guided graph to exploit the tokens’ informative interaction for activation enrichment, thus improving alignment probing and facilitating intervention. Subsequently, Misalignment-aware Adaptive Intervention (MAI) comprehensively perceives the token-level misalignment degree from token representation and prediction to guide the adaptive adjustment of editing strength, thereby enhancing final alignment performance. Extensive experiments on three alignment capabilities demonstrate the efficacy of TAE, notably surpassing baseline by 25.8% on the primary metric of truthfulness with minimal cost.
Search
Fix author
Co-authors
- Xianglong Liu 2
- Yuqing Ma 2
- Tianbo Wang 2
- Zhange Zhang 2
- Shan He 1
- show all...