Jiaheng Zhang
2026
Efficient Test-Time Scaling of Multi-Step Reasoning by Probing Internal States of Large Language Models
Jingwei Ni | Ekaterina Fadeeva | Tianyi Wu | Mubashara Akhtar | Jiaheng Zhang | Elliott Ash | Markus Leippold | Timothy Baldwin | See-Kiong Ng | Artem Shelmanov | Mrinmaya Sachan
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
LLMs can solve complex tasks by generating long, multi-step reasoning chains. Test-time scaling (TTS) can further improve LLM performance by sampling multiple variants of intermediate reasoning steps, verifying their correctness, and strategically choosing the best steps for continuation. However, existing verification approaches, such as Process Reward Models (PRMs), are computationally expensive, limited to specific domains, and require large-scale human or model-generated annotations. We propose a lightweight alternative for step-level reasoning verification based on probing the internal states of LLMs. We train a transformer-based probe that uses the internal states of the frozen LLM to estimate the credibility of its reasoning steps during generation. Annotation can be generated either by another larger LLM (e.g., DeepSeek-R1) or in a self-supervised manner by the original model itself. The probes are both effective and lightweight, containing fewer than 10M parameters. Across multiple domains, including mathematics, planning, and general knowledge question answering, our probes match or even exceed the performance of PRMs that are up to 810× larger. Our findings suggest that the internal states of LLMs encode their confidence in reasoning processes and can serve as reliable signals for reasoning step verification, offering a promising direction towards scalable and generalizable TTS and introspective LLMs.
Beyond Hard Masks: Progressive Token Evolution for Diffusion Language Models
Linhao Zhong | Linyu Wu | Bozhen Fang | Tianjian Feng | Chenchen Jing | Wen Wang | Jiaheng Zhang | Hao Chen | Chunhua Shen
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Diffusion Language Models (DLMs) offer a promising alternative for language modeling by enabling parallel decoding through iterative refinement. However, most DLMs rely on hard binary masking and discrete token assignments, which hinder the revision of early decisions and underutilize intermediate probabilistic representations. In this paper, we propose EvoToken-DLM, a novel diffusion-based language modeling approach that replaces hard binary masks with evolving soft token distributions. EvoToken-DLM enables a progressive transition from masked states to discrete outputs, supporting revisable decoding. To effectively support this evolution, we introduce continuous trajectory supervision, which aligns training objectives with iterative probabilistic updates. Extensive experiments across multiple benchmarks show that EvoToken-DLM consistently achieves superior performance, outperforming strong diffusion-based and masked DLM baselines. Our code is available at https://github.com/aim-uofa/EvoTokenDLM.
Efficient Self-Evaluation for Diffusion Language Models via Sequence Regeneration
Linhao Zhong | Linyu Wu | Wen Wang | Yuling Xi | Chenchen Jing | Jiaheng Zhang | Hao Chen | Chunhua Shen
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Diffusion large language models (dLLMs) have recently attracted significant attention for their ability to enhance diversity, controllability, and parallelism. However, their non-sequential, bidirectionally masked generation makes quality assessment difficult, underscoring the need for effective self-evaluation. In this work, we propose DiSE, a simple yet effective self-evaluation confidence quantification method for dLLMs. DiSE quantifies confidence by computing the probability of regenerating the tokens in the entire generated sequence, given the full context. This method enables more efficient and reliable quality assessment by leveraging token regeneration probabilities, facilitating both likelihood estimation and robust uncertainty quantification. Building upon DiSE, we further introduce a flexible-length generation framework, which adaptively controls the sequence length based on the model’s self-assessment of its own output. We analyze and validate the feasibility of DiSE from the perspective of dLLM generalization, and empirically demonstrate that DiSE is positively correlated with both semantic coherence and answer accuracy. Extensive experiments on likelihood evaluation, uncertainty quantification, and flexible-length generation further confirm the effectiveness of the proposed DiSE.
2025
Safety in Large Reasoning Models: A Survey
Cheng Wang | Yue Liu | Baolong Bi | Duzhen Zhang | Zhong-Zhi Li | Yingwei Ma | Yufei He | Shengju Yu | Xinfeng Li | Junfeng Fang | Jiaheng Zhang | Bryan Hooi
Findings of the Association for Computational Linguistics: EMNLP 2025
Large Reasoning Models (LRMs) have exhibited extraordinary prowess in tasks like mathematics and coding, leveraging their advanced reasoning capabilities. Nevertheless, as these capabilities progress, significant concerns regarding their vulnerabilities and safety have arisen, which can pose challenges to their deployment and application in real-world settings. This paper presents the first comprehensive survey of LRMs, meticulously exploring and summarizing the newly emerged safety risks, attacks, and defense strategies specific to these powerful reasoning-enhanced models. By organizing these elements into a detailed taxonomy, this work aims to offer a clear and structured understanding of the current safety landscape of LRMs, facilitating future research and development to enhance the security and reliability of these powerful models.
Co-authors
- Hao Chen 2
- Chenchen Jing 2
- Chunhua Shen 2
- Wen Wang 2
- Linyu Wu 2
- Linhao Zhong 2
- Mubashara Akhtar 1
- Elliott Ash 1
- Timothy Baldwin 1
- Baolong Bi 1
- Ekaterina Fadeeva 1
- Bozhen Fang 1
- Junfeng Fang 1
- Tianjian Feng 1
- Yufei He 1
- Bryan Hooi 1
- Markus Leippold 1
- Xinfeng Li 1
- Zhong-Zhi Li 1
- Yue Liu 1
- Yingwei Ma 1
- See-Kiong Ng 1
- Jingwei Ni 1
- Mrinmaya Sachan 1
- Artem Shelmanov 1
- Cheng Wang 1
- Tianyi Wu 1
- Yuling Xi 1
- Shengju Yu 1
- Duzhen Zhang 1