Xin Chen
2026
MTR-Suite: A Framework for Evaluating and Synthesizing Conversational Retrieval Benchmarks
Junhao Ruan | Abudukeyumu Abudula | Bei Li | Yongjing Yin | Xinyu Liu | Kechen Jiao | Xin Chen | Jingang Wang | Xunliang Cai | Tong Xiao | JingBo Zhu
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Accurate evaluation of conversational retrieval is pivotal for advancing Retrieval-Augmented Generation (RAG) systems. However, existing conversational retrieval benchmarks suffer from costly, sparse human annotation or rigid, unnatural automated heuristics. To address these challenges, we introduce MTR-Suite, a unified framework for auditing, synthesizing, and benchmarking retrieval. It features: (1) MTR-Eval, an LLM-based auditor quantifying alignment gaps in previous benchmarks; (2) MTR-Pipeline, a multi-agent system using greedy traversal clustering to generate high-fidelity dialogues at 1/400th human cost; and (3) MTR-Bench, a rigorous general-domain benchmark. MTR-Bench mimics production-style challenges (hard topic switching, verbosity), offering superior discriminative power. We make our code and data publicly available to facilitate future research.
BaseCal: Unsupervised Confidence Calibration via Base Model Signals
Hexiang Tan | Wanli Yang | Junwei Zhang | Xin Chen | Rui Tang | Du Su | Jingang Wang | Yuanzhuo Wang | Fei Sun | Xueqi Cheng
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Reliable confidence is essential for trusting the outputs of LLMs, yet widely deployed post-trained LLMs (PoLLMs) typically compromise this trust with severe overconfidence. In contrast, we observe that their corresponding base LLMs often remain well-calibrated. This naturally motivates us to calibrate PoLLM confidence using the base LLM as a reference. This work proposes two ways to achieve this. A straightforward solution, BaseCal-ReEval, evaluates PoLLM’s responses by feeding them into the base LLM to get average probabilities as confidence. While effective, this approach introduces additional inference overhead. To address this, we propose BaseCal-Proj, which trains a lightweight projection to map the final-layer hidden states of PoLLMs back to those of their base LLMs. These projected states are then processed by the base LLM’s output layer to derive base-calibrated confidence for PoLLM’s responses. Notably, BaseCal is an unsupervised, plug-and-play solution that operates without human labels or LLM modifications. Experiments across five datasets and three LLM families demonstrate the effectiveness of BaseCal, reducing Expected Calibration Error (ECE) by an average of 42.90% compared to the best unsupervised baselines.
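The two quantities named in the abstract above, an average token probability used as a confidence score and Expected Calibration Error (ECE), can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the equal-width binning with 10 bins, and the use of a plain arithmetic mean over token probabilities are assumptions made for illustration.

```python
import math


def mean_token_probability(token_logprobs):
    """Average per-token probability of a response (assumed confidence score).

    token_logprobs: natural-log probabilities, one per generated token,
    e.g. as scored by a reference model. Hypothetical helper.
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return sum(math.exp(lp) for lp in token_logprobs) / len(token_logprobs)


def expected_calibration_error(confidences, correct, n_bins=10):
    """Standard equal-width-bin ECE.

    confidences: floats in [0, 1]; correct: booleans of the same length.
    ECE = sum over bins of (bin size / N) * |accuracy - mean confidence|.
    """
    if len(confidences) != len(correct) or not confidences:
        raise ValueError("inputs must be non-empty and of equal length")
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(accuracy - avg_conf)
    return ece
```

A lower ECE means that stated confidence tracks empirical accuracy more closely; an overconfident model shows mean confidence well above accuracy in the high-confidence bins.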
Steering Away from Refusal: A Black-box Jailbreak Method Based on First-Token Distribution
Shuangjie Fu | Du Su | Xin Chen | Fei Sun | Huawei Shen | Xueqi Cheng
Findings of the Association for Computational Linguistics: ACL 2026
Investigating black-box jailbreak attacks is crucial for revealing the actual security risks faced by operational Large Language Models (LLMs). The primary challenge in black-box jailbreak attacks is the absence of direct optimization signals, such as gradients, to guide the refinement of adversarial prompts. Current mainstream methods like PAIR and TAP attempt to leverage the model’s textual output as feedback, but they face a critical limitation when models consistently generate static refusal responses, depriving the attacker of any actionable signal to distinguish better prompts. To overcome this bottleneck and reveal whether open access to partial logprobs information poses a potential risk, we investigate the LLM output distribution. Our empirical analysis reveals that refusal responses exhibit a highly consistent distributional pattern at the first generated token, suggesting that deviation from this standard pattern can serve as a quantifiable metric for whether an LLM is generating a refusal response. Based on this insight, we propose Distribution Jailbreak (DJ), an attack method that selects effective jailbreak templates and then iteratively optimizes adversarial suffixes by maximizing the KL divergence from the standard refusal distribution. Extensive experiments demonstrate that DJ achieves state-of-the-art Attack Success Rate (ASR). Notably, DJ achieves over 90% ASR on all tested open-source models, and delivers over 94% ASR on GPT-4.1. Our code is publicly available at https://github.com/Zed630/DistributionJailbreak.
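The distributional measure the abstract refers to, a KL divergence between a first-token distribution and a reference refusal distribution, can be sketched as follows. This only illustrates the metric and is not the authors' code: the helper names, the renormalization of partial (top-k) log-probabilities, the epsilon floor for tokens missing from the reference, and the direction KL(observed || reference) are all assumptions.

```python
import math


def normalize_top_logprobs(top_logprobs):
    """Turn a partial {token: logprob} mapping into a probability distribution.

    Only the top-k tokens are assumed to be visible, so the probabilities
    are renormalized to sum to 1 (an assumption of this sketch).
    """
    probs = {tok: math.exp(lp) for tok, lp in top_logprobs.items()}
    total = sum(probs.values())
    if total <= 0.0:
        raise ValueError("log-probabilities carry no probability mass")
    return {tok: p / total for tok, p in probs.items()}


def kl_divergence(observed, reference, eps=1e-12):
    """KL(observed || reference) over token distributions stored as dicts.

    Tokens absent from `reference` are floored at `eps` so the value
    stays finite; tokens with zero observed mass contribute nothing.
    """
    divergence = 0.0
    for tok, p in observed.items():
        if p <= 0.0:
            continue
        q = max(reference.get(tok, 0.0), eps)
        divergence += p * math.log(p / q)
    return divergence
```

Under these assumptions, a first-token distribution identical to the reference gives a divergence of 0, and the value grows as probability mass moves onto tokens that the reference rarely produces.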
BAPO: Boundary-Aware Policy Optimization for Reliable Agentic Search
Shiyu Liu | Yongjing Yin | Jianhao Yan | Yunbo Tang | Qinggang Zhang | Bei Li | Xin Chen | Jingang Wang | Xunliang Cai | Jinsong Su
Findings of the Association for Computational Linguistics: ACL 2026
RL-based agentic search enables LLMs to solve complex questions via dynamic planning and external search. While this approach significantly enhances accuracy with agent policies optimized via large-scale reinforcement learning, we identify a critical gap in reliability: these agents fail to recognize their reasoning boundaries and rarely admit "I DON’T KNOW" even when evidence is insufficient or reasoning reaches its limit. The lack of reliability often leads to plausible but unreliable answers, introducing significant risks in many real-world scenarios. To this end, we propose Boundary-Aware Policy Optimization (BAPO), a novel RL framework designed to cultivate reliable boundary awareness without compromising accuracy. BAPO introduces two key components: (i) a group-based boundary-aware reward that encourages an IDK response only when the reasoning reaches its limit, and (ii) an adaptive reward modulator that strategically suspends this reward during early exploration, preventing the model from exploiting IDK as a shortcut. Extensive experiments on four benchmarks demonstrate that BAPO substantially enhances the overall reliability of agentic search.
LANG: Reinforcement Learning for Multilingual Reasoning with Language-Adaptive Hint Guidance
Yuchun Fan | Bei Li | Peiguang Li | Yilin Wang | Yongyu Mu | Jian Yang | Xin Chen | Rongxiang Weng | Jingang Wang | Xunliang Cai | JingBo Zhu | Tong Xiao
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Reinforcement learning has proven effective for enhancing multi-step reasoning in Large Language Models (LLMs), yet its benefits have not fully translated to multilingual contexts. Existing methods struggle with a fundamental trade-off: prioritizing input-language consistency severely hampers reasoning quality, while prioritizing reasoning often leads to unintended language drift toward English. We address this challenge with LANG, a novel framework that leverages language-conditioned hints to guide exploration in non-English reasoning tasks. Our method incorporates two key mechanisms to prevent dependency on these hints: a progressive decay schedule that gradually withdraws scaffolding, and a language-adaptive switch that tailors learning horizons to specific language difficulties. Empirical results on challenging multilingual mathematical benchmarks reveal that LANG substantially enhances reasoning performance without compromising language consistency. Moreover, we show that our framework generalizes beyond mathematics, fostering more consistent language alignment across model layers.
Co-authors
- Jingang Wang 4
- Xunliang Cai 3
- Bei Li 3
- Xueqi Cheng (程学旗) 2
- Du Su 2
- Fei Sun 2
- Tong Xiao (肖桐) 2
- Yongjing Yin 2
- JingBo Zhu (朱靖波) 2
- Abudukeyumu Abudula 1
- Yuchun Fan 1
- Shuangjie Fu 1
- Kechen Jiao 1
- Peiguang Li 1
- Xinyu Liu 1
- Shiyu Liu 1
- Yongyu Mu 1
- Junhao Ruan 1
- Huawei Shen (沈华伟) 1
- Jinsong Su 1
- Hexiang Tan 1
- Rui Tang 1
- Yunbo Tang 1
- Yuanzhuo Wang 1
- Yilin Wang 1
- Rongxiang Weng 1
- Jianhao Yan 1
- Wanli Yang 1
- Jian Yang 1
- Junwei Zhang 1
- Qinggang Zhang 1