Sendong Zhao
2026
When Correct Beliefs Collapse: Epistemic Resilience of LLMs under Clinical Pressure
Boyu Xiao | Xiuqi Tian | Xuwen Song | Haochun Wang | Guanchun Song | Sendong Zhao | Bing Qin
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Despite strong medical benchmark accuracy, LLMs can exhibit severe multi-turn sycophancy in clinical dialogue, abandoning an initially correct diagnosis under escalating pressure. We propose Med-Stress, a targeted stress-test framework that evaluates belief stability under escalating pressure. Across nine frontier large language models (LLMs), we find a clear dissociation between medical knowledge and robustness: high initial diagnostic capability does not imply high belief stability, yielding large knowledge-robustness gaps for several LLMs. To mitigate this failure mode, we propose a lightweight inference-time defense, RBED (Role-Based Epistemic Defense), and R-FT (Resilience-oriented Fine-Tuning), a training-time approach that internalizes evidence-based resistance to pressure. Experiments show that R-FT nearly eliminates belief change and substantially improves robustness.
Collaborative Chain-of-Agents for Parametric-Retrieved Knowledge Synergy
Yi Jiang | Sendong Zhao | Jianbo Li | Haochun Wang | Lizhe Zhang | Yan Liu | Bing Qin
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Retrieval-Augmented Generation (RAG) enhances Large Language Models (LLMs), especially for knowledge-intensive tasks. Despite its advantages, current RAG methods often struggle to fully exploit knowledge during generation. In particular, the synergy between the model’s internal parametric knowledge and external retrieved knowledge remains limited. Retrieved contents may sometimes mislead generation, while certain generated content can guide the model toward more accurate outputs. In this work, we propose Collaborative Chain-of-Agents, a framework designed to explicitly enhance the synergy between parametric and retrieved knowledge. Specifically, we first introduce CoCoA-zero, a multi-agent RAG framework that performs conditional knowledge induction and then reasons toward answers. Building on this, we develop CoCoA, a long-chain training strategy that synthesizes extended multi-agent reasoning trajectories from CoCoA-zero to fine-tune the LLM. This strategy enhances the model’s capability to explicitly integrate and jointly leverage parametric and retrieved knowledge. Experimental results demonstrate the superiority of CoCoA in open-domain QA and multi-hop QA.
Toward Secure Tuning: Mitigating Security Risks from Instruction Fine-Tuning
Yanrui Du | Fenglei Fan | Sendong Zhao | Jiawei Cao | Ming Ma | Danyang Zhao | Shuren Qi | Ting Liu | Bing Qin
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Instruction Fine-Tuning (IFT) has emerged as a critical technique for customizing Large Language Models (LLMs) to meet diverse downstream applications. However, recent studies have revealed that IFT can compromise the built-in security mechanisms of LLMs, thereby posing significant security risks. Although defense methods targeting various training stages have been proposed, they either face challenges in practical deployment or exhibit instability and limited performance gains. In our study, we propose a novel SWAT method that introduces a key idea: shifting more of the learning burden onto security-robust parameters. To this end, our study investigates how module-level parameters affect LLMs’ internal security feature space, aiming to uncover robustness patterns in parameters. Guided by this analysis, we identify a robust module set (Mods_Rob) that exhibits minimal effects on LLMs’ security feature space. Leveraging this insight, SWAT proceeds in two phases: (1) a warm-up phase that preferentially trains Mods_Rob to learn low-level features with minimal security risk, followed by (2) standard tuning to achieve optimal task performance. Across diverse knowledge-intensive datasets, scenarios, and LLMs, SWAT substantially reduces security risks without sacrificing task performance gains.
2025
LLMs May Perform MCQA by Selecting the Least Incorrect Option
Haochun Wang | Sendong Zhao | Zewen Qiang | Nuwa Xi | Bing Qin | Ting Liu
Proceedings of the 31st International Conference on Computational Linguistics
In the field of NLP, Large Language Models (LLMs) have markedly enhanced performance across a variety of tasks. However, the comprehensive evaluation of LLMs remains an inevitable challenge for the community. Recently, the adoption of Multiple Choice Question Answering (MCQA) as a benchmark for assessing LLMs has gained considerable traction. However, concerns regarding the robustness of this evaluation method persist. Building upon previous discussions on the issue of variability, we reveal an additional dimension of concern: LLMs may perform MCQA by selecting the least incorrect option rather than a distinctly correct one. This observation suggests that LLMs might regard multiple options as correct, which could undermine the reliability of MCQA as a metric for evaluating LLMs. To address this challenge, we introduce an enhanced dataset augmentation method for MCQA, termed MCQA+, to provide a more accurate reflection of model performance, thereby highlighting the necessity for more sophisticated evaluation mechanisms in the assessment of LLM capabilities.
GainRAG: Preference Alignment in Retrieval-Augmented Generation through Gain Signal Synthesis
Yi Jiang | Sendong Zhao | Jianbo Li | Haochun Wang | Bing Qin
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
The Retrieval-Augmented Generation (RAG) framework introduces a retrieval module to dynamically inject retrieved information into the input context of large language models (LLMs), and has demonstrated significant success in various NLP tasks. However, the current study points out that there is a preference gap between retrievers and LLMs in the RAG framework, which limits further improvement of system performance. Some highly relevant passages may interfere with LLM reasoning because they contain complex or contradictory information, while some indirectly related or even inaccurate content may help the LLM generate more accurate answers by providing suggestive information or logical clues. To solve this, we propose GainRAG, a novel approach that aligns the retriever’s and LLM’s preferences by defining a new metric, “gain”, which measures how well an input passage contributes to correct outputs. We then propose a method to estimate these gain signals and train a middleware that aligns the preferences of the retriever and the LLM using only limited data. In addition, we introduce a pseudo-passage strategy to mitigate degradation. The experimental results on 6 datasets verify the effectiveness of GainRAG.
ReLearn: Unlearning via Learning for Large Language Models
Haoming Xu | Ningyuan Zhao | Liming Yang | Sendong Zhao | Shumin Deng | Mengru Wang | Bryan Hooi | Nay Oo | Huajun Chen | Ningyu Zhang
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Current unlearning methods for large language models usually rely on reverse optimization to reduce target token probabilities. However, this paradigm disrupts the prediction of subsequent tokens, degrading model performance and linguistic coherence. Moreover, existing evaluation metrics overemphasize contextual forgetting while inadequately assessing response fluency and relevance. To address these challenges, we propose ReLearn, a data augmentation and fine-tuning pipeline for effective unlearning, along with a comprehensive evaluation framework. This framework introduces Knowledge Forgetting Ratio (KFR) and Knowledge Retention Ratio (KRR) to measure knowledge-level preservation, and Linguistic Score (LS) to evaluate generation quality. Our experiments show that ReLearn successfully achieves targeted forgetting while preserving high-quality outputs. Through mechanistic analysis, we further demonstrate how reverse optimization disrupts coherent text generation, while ReLearn preserves this essential capability.
Beyond Frameworks: Unpacking Collaboration Strategies in Multi-Agent Systems
Haochun Wang | Sendong Zhao | Jingbo Wang | Zewen Qiang | Bing Qin | Ting Liu
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Multi-agent collaboration has emerged as a pivotal paradigm for addressing complex, distributed tasks in large language model (LLM)-driven applications. While prior research has focused on high-level architectural frameworks, the granular mechanisms governing agents—critical to performance and scalability—remain underexplored. This study systematically investigates four dimensions of collaboration strategies: (1) agent governance, (2) participation control, (3) interaction dynamics, and (4) dialogue history management. Through rigorous experimentation under two context-dependent scenarios—Distributed Evidence Integration (DEI) and Structured Evidence Synthesis (SES)—we quantify the impact of these strategies on both task accuracy and computational efficiency. Our findings reveal that centralized governance, instructor-led participation, ordered interaction patterns, and instructor-curated context summarization collectively optimize the trade-off between decision quality and resource utilization with the support of the proposed Token-Accuracy Ratio (TAR). This work establishes a foundation for designing adaptive, scalable multi-agent systems, shifting the focus from structural novelty to strategic interaction mechanics.
2024
AS-ES Learning: Towards efficient CoT learning in small models
Nuwa Xi | Yuhan Chen | Sendong Zhao | Haochun Wang | GongZhang GongZhang | Bing Qin | Ting Liu
Findings of the Association for Computational Linguistics: ACL 2024
Chain-of-Thought (CoT) serves as a critical emerging ability in LLMs, especially when it comes to logical reasoning. Attempts have been made to induce such ability in small models as well by distilling from the data with CoT generated by Large Language Models (LLMs). However, existing methods often simply generate and incorporate more data from LLMs and fail to note the importance of efficiently utilizing existing CoT data. We here propose a new training paradigm AS-ES (Abstractive Segments - Extractive Segments) learning, which exploits the inherent information in CoT for iterative generation. Experiments show that our methods surpass the direct seq2seq training on CoT-extensive tasks like MWP and PET summarization, without data augmentation or altering the model itself. Furthermore, we explore the reason behind the inefficiency of small models in learning CoT and provide an explanation of why AS-ES learning works, giving insights into the underlying mechanism of CoT.
2023
Make Your Decision Convincing! A Unified Two-Stage Framework: Self-Attribution and Decision-Making
Yanrui Du | Sendong Zhao | Haochun Wang | Yuhan Chen | Rui Bai | Zewen Qiang | Muzhen Cai | Bing Qin
Findings of the Association for Computational Linguistics: EMNLP 2023
Explaining black-box model behavior with natural language has achieved impressive results in various NLP tasks. Recent research has explored the utilization of subsequences from the input text as a rationale, providing users with evidence to support the model decision. Although existing frameworks excel in generating high-quality rationales while achieving high task performance, they neglect to account for the unreliable link between the generated rationale and model decision. In simpler terms, a model may make correct decisions while attributing wrong rationales, or make poor decisions while attributing correct rationales. To mitigate this issue, we propose a unified two-stage framework known as Self-Attribution and Decision-Making (SADM). Through extensive experiments on five reasoning datasets from the ERASER benchmark, we demonstrate that our framework not only establishes a more reliable link between the generated rationale and model decision but also achieves competitive results in task performance and the quality of rationale. Furthermore, we explore the potential of our framework in semi-supervised scenarios.
UniCoRN: Unified Cognitive Signal ReconstructioN bridging cognitive signals and human language
Nuwa Xi | Sendong Zhao | Haochun Wang | Chi Liu | Bing Qin | Ting Liu
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Decoding text stimuli from cognitive signals (e.g., fMRI) enhances our understanding of the human language system, paving the way for building versatile Brain-Computer Interfaces. However, existing studies largely focus on decoding individual word-level fMRI volumes from a restricted vocabulary, which is far too idealized for real-world application. In this paper, we propose fMRI2text, the first open-vocabulary task aiming to bridge fMRI time series and human language. Furthermore, to explore the potential of this new task, we present a baseline solution, UniCoRN: the Unified Cognitive Signal ReconstructioN for Brain Decoding. By reconstructing both individual time points and time series, UniCoRN establishes a robust encoder for cognitive signals (fMRI & EEG). Leveraging a pre-trained language model as decoder, UniCoRN proves its efficacy in decoding coherent text from fMRI series across various split settings. Our model achieves a 34.77% BLEU score on fMRI2text, and a 37.04% BLEU when generalized to EEG-to-text decoding, thereby surpassing the former baseline. Experimental results indicate the feasibility of decoding consecutive fMRI volumes, and the effectiveness of decoding different cognitive signals using a unified structure.
2022
Prompt Combines Paraphrase: Teaching Pre-trained Models to Understand Rare Biomedical Words
Haochun Wang | Chi Liu | Nuwa Xi | Sendong Zhao | Meizhi Ju | Shiwei Zhang | Ziheng Zhang | Yefeng Zheng | Bing Qin | Ting Liu
Proceedings of the 29th International Conference on Computational Linguistics
Prompt-based fine-tuning for pre-trained models has proven effective for many natural language processing tasks under few-shot settings in the general domain. However, prompt-based tuning in the biomedical domain has not been investigated thoroughly. Biomedical words are often rare in the general domain but ubiquitous in biomedical contexts, which dramatically deteriorates the performance of pre-trained models on downstream biomedical applications even after fine-tuning, especially in low-resource scenarios. We propose a simple yet effective approach to helping models learn rare biomedical words during prompt-based tuning. Experimental results show that our method can achieve up to 6% improvement on a biomedical natural language inference task without any extra parameters or training steps using few-shot vanilla prompt settings.
2021
Less Is More: Domain Adaptation with Lottery Ticket for Reading Comprehension
Haichao Zhu | Zekun Wang | Heng Zhang | Ming Liu | Sendong Zhao | Bing Qin
Findings of the Association for Computational Linguistics: EMNLP 2021
In this paper, we propose a simple few-shot domain adaptation paradigm for reading comprehension. We first identify the lottery subnetwork structure within the Transformer-based source domain model via gradual magnitude pruning. Then, we only fine-tune the lottery subnetwork, a small fraction of the whole parameters, on the annotated target domain data for adaptation. To obtain more adaptable subnetworks, we introduce self-attention attribution to weigh parameters, beyond simply pruning the smallest magnitude parameters, which can be seen as combining structured pruning and unstructured magnitude pruning softly. Experimental results show that our method outperforms the full model fine-tuning adaptation on four out of five domains when only a small amount of annotated data is available for adaptation. Moreover, introducing self-attention attribution reserves more parameters for important attention heads in the lottery subnetwork and improves the target domain model performance. Our further analyses reveal that, besides exploiting fewer parameters, the choice of subnetworks is critical to the effectiveness.
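As a rough illustration of the two steps named in this abstract (finding a subnetwork by gradual magnitude pruning, then updating only that subnetwork), here is a minimal pure-Python sketch. It is not the authors' implementation: it covers plain magnitude pruning only and omits the self-attention attribution weighting, and the function names, the linear sparsity schedule, the toy weights, and the learning rate are all assumptions made for this example.

```python
def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that prunes the smallest-magnitude weights.

    weights: flat list of floats; sparsity: fraction of entries to prune.
    """
    n_prune = int(len(weights) * sparsity)
    # Indices in ascending order of absolute value; the first n_prune are pruned.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(weights))]


def gradual_sparsities(final_sparsity, steps):
    """Linearly increasing sparsity targets, one per pruning round (assumed schedule)."""
    return [final_sparsity * (k + 1) / steps for k in range(steps)]


def masked_update(weights, grads, mask, lr):
    """One SGD step that changes only the weights kept by the mask."""
    return [w - lr * g * m for w, g, m in zip(weights, grads, mask)]


# Toy "source model" weights (hypothetical values).
weights = [0.05, -0.8, 0.3, -0.01, 1.2, 0.4, -0.02, 0.6]

mask = [1] * len(weights)
for s in gradual_sparsities(final_sparsity=0.5, steps=2):
    # Already-pruned entries count as zero, so they stay pruned in later rounds.
    current = [w * m for w, m in zip(weights, mask)]
    mask = magnitude_mask(current, s)

# Fine-tuning step: only the surviving subnetwork receives updates.
grads = [1.0] * len(weights)
updated = masked_update(weights, grads, mask, lr=0.1)
```

In this sketch the mask is computed once on the source model and then held fixed, so the pruned half of the parameters never moves during adaptation. A real setup would apply the same idea per weight matrix inside a training framework.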
Co-authors
- Bing Qin (秦兵) 11
- Haochun Wang 9
- Ting Liu 5
- Nuwa Xi 4
- Zewen Qiang 3
- Yuhan Chen 2
- Yanrui Du 2
- Yi Jiang 2
- Jianbo Li 2
- Chi Liu 2
- Rui Bai 1
- Muzhen Cai 1
- Jiawei Cao 1
- Huajun Chen 1
- Shumin Deng 1
- Fenglei Fan 1
- GongZhang GongZhang 1
- Bryan Hooi 1
- Meizhi Ju 1
- Ming Liu 1
- Ting Liu 1
- Yan Liu 1
- Ming Ma 1
- Nay Oo 1
- Shuren Qi 1
- Guanchun Song 1
- Xuwen Song 1
- Xiuqi Tian 1
- Jingbo Wang 1
- Mengru Wang 1
- Zekun Wang 1
- Boyu Xiao 1
- Haoming Xu 1
- Liming Yang 1
- Heng Zhang 1
- Lizhe Zhang 1
- Ningyu Zhang 1
- Shiwei Zhang 1
- Ziheng Zhang 1
- Danyang Zhao 1
- Ningyuan Zhao 1
- Yefeng Zheng 1
- Haichao Zhu 1