2025
LLMs Trust Humans More, That’s a Problem! Unveiling and Mitigating the Authority Bias in Retrieval-Augmented Generation
Yuxuan Li | Xinwei Guo | Jiashi Gao | Guanhua Chen | Xiangyu Zhao | Jiaxin Zhang | Quanying Liu | Haiyan Wu | Xin Yao | Xuetao Wei
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Retrieval-Augmented Generation (RAG) has proven to be an effective approach to addressing the hallucination problem in large language models (LLMs). In current RAG systems, LLMs typically need to synthesize knowledge from two main external sources (user prompts and an external database) to generate a final answer. When the knowledge provided by the user conflicts with that retrieved from the database, a critical question arises: does the LLM favor one knowledge source over the other when generating the answer? In this paper, we are the first to unveil a new phenomenon, Authority Bias, in which LLMs tend to favor the knowledge provided by the user even when it deviates from the facts; we rigorously evidence this phenomenon via a novel, comprehensive characterization of Authority Bias across six widely used LLMs and diverse task scenarios. We propose a dataset specifically designed for detecting Authority Bias, the Authority Bias Detection Dataset (ABDD), and introduce new, detailed metrics to measure it. To mitigate Authority Bias, we then propose the Conflict Detection Enhanced Query (CDEQ) framework: we identify the sentences and atomic information that generate conflicts, perform a credibility assessment of the conflicting paragraphs, and ultimately enhance the query so that perturbed text is detected, thereby reducing Authority Bias. Comparative experiments with widely used mitigation methods demonstrate the effectiveness and superiority of CDEQ, which significantly enhances the robustness of RAG systems.
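For readers who want a concrete picture, the short Python sketch below illustrates the three steps the abstract names (conflict identification, credibility assessment, query enhancement) under simplified assumptions; the function names and the toy negation-based contradiction check are hypothetical stand-ins, not the paper's CDEQ implementation.

```python
# Minimal illustrative sketch of a conflict-detection-enhanced query loop.
# This is NOT the authors' CDEQ implementation: detect_conflicts,
# credibility_score, and the toy contradiction check are hypothetical stand-ins.

def detect_conflicts(user_claims, retrieved_sentences, contradicts):
    """Collect (user claim, retrieved sentence) pairs that contradict each other."""
    return [(c, s) for c in user_claims for s in retrieved_sentences if contradicts(c, s)]

def credibility_score(source):
    """Toy credibility assessment; the paper's version would be model-based."""
    return 1.0 if source == "retrieved" else 0.5

def enhance_query(query, conflicts):
    """Append an instruction flagging detected conflicts so the LLM re-checks them."""
    if not conflicts:
        return query
    prefer = ("retrieved evidence"
              if credibility_score("retrieved") >= credibility_score("user_prompt")
              else "user statement")
    flagged = "; ".join(f"user says '{c}' but evidence says '{s}'" for c, s in conflicts)
    return (f"{query}\n\nWarning: conflicting statements were detected ({flagged}). "
            f"When they disagree, rely on the {prefer}.")

if __name__ == "__main__":
    # Crude stand-in for an NLI model: same subject word, opposite negation.
    contradicts = lambda a, b: a.split()[0] == b.split()[0] and ("not" in a) != ("not" in b)
    conflicts = detect_conflicts(["Paris is not the capital of France."],
                                 ["Paris is the capital of France."], contradicts)
    print(enhance_query("What is the capital of France?", conflicts))
```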
Towards Boosting LLMs-driven Relevance Modeling with Progressive Retrieved Behavior-augmented Prompting
Zeyuan Chen | Haiyan Wu | Kaixin Wu | Wei Chen | Mingjie Zhong | Jia Xu | Zhongyi Liu | Wei Zhang
Proceedings of the 31st International Conference on Computational Linguistics: Industry Track
This paper studies the relevance modeling problem by integrating world knowledge stored in the parameters of LLMs with specialized domain knowledge represented by user behavior data to achieve strong performance. We propose a novel framework, ProRBP, which develops a user-driven behavior neighbor retrieval module to learn domain-specific knowledge in a timely manner and introduces a progressive prompting and aggregation module that accounts for diverse aspects of relevance and for prediction stability. We also explore an industrial implementation that deploys LLMs to handle the full-scale search traffic of Alipay with acceptable cost and latency. Comprehensive experiments on real-world industry data and online A/B testing validate the superiority of our proposal and the effectiveness of its main modules.
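A minimal sketch, under assumptions of our own, of how retrieved behavior neighbors might be folded progressively into a relevance prompt; the token-overlap retriever and the prompt template below are illustrative only and are not the ProRBP modules described in the paper.

```python
# Illustrative sketch only, not the ProRBP implementation from the paper.
# retrieve_neighbors / build_prompt and the scoring logic are assumptions
# about what behavior-augmented relevance prompting could look like.

def retrieve_neighbors(query, behavior_log, k=3):
    """Return the k logged (query, item, clicked) records whose query shares the
    most tokens with the incoming query -- a stand-in for behavior neighbor retrieval."""
    q_tokens = set(query.lower().split())
    scored = sorted(behavior_log,
                    key=lambda rec: len(q_tokens & set(rec[0].lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query, item, neighbors):
    """Progressively add behavior evidence before asking for a relevance judgment."""
    lines = ["Judge whether the item is relevant to the query."]
    for q, it, clicked in neighbors:
        lines.append(f"Past behavior: query='{q}', item='{it}', clicked={clicked}")
    lines.append(f"Query: '{query}'  Item: '{item}'  Relevant (yes/no)?")
    return "\n".join(lines)

if __name__ == "__main__":
    log = [("phone case", "silicone iphone case", True),
           ("phone charger", "usb-c fast charger", True),
           ("coffee mug", "ceramic mug 350ml", False)]
    neighbors = retrieve_neighbors("iphone case", log)
    print(build_prompt("iphone case", "leather iphone 15 case", neighbors))
```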
The Elephant in the Room: Exploring the Role of Neutral Words in Language Model Group-Agnostic Debiasing
Xinwei Guo | Jiashi Gao | Junlei Zhou | Jiaxin Zhang | Guanhua Chen | Xiangyu Zhao | Quanying Liu | Haiyan Wu | Xin Yao | Xuetao Wei
Findings of the Association for Computational Linguistics: ACL 2025
Large Language Models (LLMs) are increasingly integrated into our daily lives, raising significant ethical concerns, especially about perpetuating stereotypes. While group-specific debiasing methods have made progress, they often fail to address multiple biases simultaneously. In contrast, group-agnostic debiasing has the potential to mitigate a variety of biases at once but remains underexplored. In this work, we investigate the role of neutral words (the group-agnostic component) in enhancing the group-agnostic debiasing process. We first reveal that neutral words are essential for preserving semantic modeling, and we propose 𝜖-DPCE, a method that incorporates a neutral-word-semantics-based loss function to effectively alleviate the deterioration of the Language Modeling Score (LMS) during debiasing. Furthermore, by introducing the SCM-Projection method, we demonstrate that SCM-based debiasing eliminates stereotypes by indirectly disrupting the association between attribute words and neutral words in the Stereotype Content Model (SCM) space. Our experiments show that neutral words, which often embed multi-group stereotypical objects, play a key role in the group-agnostic nature of SCM-based debiasing.
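The core idea of a neutral-word semantics loss can be sketched as follows, assuming an epsilon-weighted cosine-drift penalty on neutral-word embeddings; this is only an illustrative approximation, and the paper's actual 𝜖-DPCE objective and loss terms may differ.

```python
# Illustrative sketch only, assuming an epsilon-weighted penalty on the drift of
# neutral-word embeddings; this is not the paper's exact 𝜖-DPCE objective.

import torch
import torch.nn.functional as F

def neutral_semantics_loss(debiased_emb, original_emb, epsilon=0.1):
    """Penalize how far neutral-word embeddings have drifted from their
    pre-debiasing semantics (1 - cosine similarity, averaged over words)."""
    drift = 1.0 - F.cosine_similarity(debiased_emb, original_emb, dim=-1).mean()
    return epsilon * drift

if __name__ == "__main__":
    torch.manual_seed(0)
    original = torch.randn(100, 768)                     # neutral-word embeddings before debiasing
    debiased = original + 0.05 * torch.randn(100, 768)   # embeddings after a debiasing update
    debias_loss = torch.tensor(0.3)                      # placeholder for the group-agnostic debiasing term
    total = debias_loss + neutral_semantics_loss(debiased, original)
    print(float(total))
```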
2020
Modularized Syntactic Neural Networks for Sentence Classification
Haiyan Wu | Ying Liu | Shaoyun Shi
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
This paper focuses on tree-based modeling for the sentence classification task. Existing works that aggregate over a syntax tree usually consider only the local information of sub-trees. In contrast, in addition to this local information, our proposed Modularized Syntactic Neural Network (MSNN) utilizes syntax category labels and takes advantage of the global context while modeling sub-trees. In MSNN, each node of a syntax tree is modeled by a label-related syntax module. Each syntax module aggregates the outputs of lower-level modules, and finally the root module provides the sentence representation. We design a tree-parallel mini-batch strategy for efficient training and prediction. Experimental results on four benchmark datasets show that our MSNN significantly outperforms previous state-of-the-art tree-based methods on the sentence classification task.
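As a toy illustration of recursive, label-keyed aggregation over a constituency tree, consider the sketch below; here each "module" is just a fixed pooling function rather than a learned network, so it only mirrors the structure of MSNN and is not the model itself. Node labels and vectors are made up.

```python
# Toy sketch of bottom-up, label-keyed aggregation over a syntax tree;
# the real MSNN modules are learned networks, whereas here each "module"
# is a simple pooling function chosen for illustration.

import numpy as np

class Node:
    def __init__(self, label, children=None, vector=None):
        self.label = label            # syntax category label, e.g. "NP", "VP"
        self.children = children or []
        self.vector = vector          # word embedding at the leaves

def encode(node, modules):
    """Bottom-up: encode all children, then apply the module for this node's label."""
    if not node.children:
        return node.vector
    child_vecs = np.stack([encode(c, modules) for c in node.children])
    module = modules.get(node.label, lambda v: v.mean(axis=0))  # fallback module
    return module(child_vecs)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    leaf = lambda: Node("WORD", vector=rng.random(4))    # toy 4-d word embeddings
    tree = Node("S", [Node("NP", [leaf(), leaf()]),      # "the cat"
                      Node("VP", [leaf()])])             # "sleeps"
    modules = {"NP": lambda v: v.mean(axis=0),           # label-specific "modules"
               "VP": lambda v: v.max(axis=0)}
    sentence_repr = encode(tree, modules)                # root module output
    print(sentence_repr.shape)
```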