Guibo Luo

2026

SeLaR: Selective Latent Reasoning in Large Language Models
Renyu Fu | Guibo Luo
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Chain-of-Thought (CoT) has become a cornerstone of reasoning in large language models, yet its effectiveness is constrained by the limited expressiveness of discrete token sampling. Recent latent reasoning approaches attempt to alleviate this limitation by replacing discrete tokens with soft embeddings (probability-weighted mixtures of token embeddings) or hidden states, but they commonly suffer from two issues: (1) global activation injects perturbations into high-confidence steps, impairing reasoning stability; and (2) soft embeddings quickly collapse toward the highest-probability token, limiting exploration of alternative trajectories. To address these challenges, we propose SeLaR (Selective Latent Reasoning), a lightweight and training-free framework. SeLaR introduces an entropy-gated mechanism that activates soft embeddings only at low-confidence steps, while preserving discrete decoding at high-confidence steps. Additionally, we propose an entropy-aware contrastive regularization that pushes soft embeddings away from the highest-probability token’s direction, encouraging sustained exploration of multiple latent reasoning paths. Experiments on five reasoning benchmarks demonstrate that SeLaR consistently outperforms standard CoT and state-of-the-art training-free methods.
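The entropy gate described in this abstract can be illustrated with a minimal sketch. It assumes plain Python lists for the next-token distribution and the embedding table, uses greedy selection for the high-confidence branch, and takes a hypothetical `threshold` parameter; none of these names or values come from the paper, and the contrastive regularization is not covered.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def next_input_embedding(probs, embeddings, threshold):
    """Pick the embedding fed to the next decoding step.

    probs:      next-token probabilities, one per vocabulary entry
    embeddings: one embedding vector (list of floats) per vocabulary entry
    threshold:  hypothetical entropy cutoff, not a value from the paper
    """
    if entropy(probs) <= threshold:
        # High confidence: keep discrete decoding (greedy here, as an assumption).
        top = max(range(len(probs)), key=lambda i: probs[i])
        return list(embeddings[top])
    # Low confidence: probability-weighted mixture of token embeddings.
    dim = len(embeddings[0])
    return [sum(p * e[d] for p, e in zip(probs, embeddings)) for d in range(dim)]
```

With a two-token vocabulary, a sharply peaked distribution falls below the cutoff and yields the top token's embedding unchanged, while a uniform distribution exceeds it and yields the average of the two embeddings.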


LEASH: Adaptive Length Penalty and Reward Shaping for Efficient Large Reasoning Model
Yanhao Li | Lu Ma | Jiaran Zhang | Lexiang Tang | Wentao Zhang | Guibo Luo
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Large Language Models (LLMs) often produce unnecessarily lengthy reasoning traces, which significantly increase computational cost and latency. Existing approaches typically rely on fixed length penalties, but such penalties are hard to tune and fail to adapt to the evolving reasoning abilities of LLMs, leading to suboptimal trade-offs between accuracy and conciseness. To address this challenge, we propose **LEASH** (*adaptive LEngth penAlty and reward SHaping*), a reinforcement learning framework for efficient reasoning in LLMs. We formulate length control as a constrained optimization problem and employ a Lagrangian primal–dual method to dynamically adjust the penalty coefficient. When generations exceed the target length, the penalty is intensified; when they are shorter, it is relaxed. This adaptive mechanism guides models toward producing concise reasoning without sacrificing task performance. Experiments on DeepSeek-R1-Distill-Qwen-1.5B and Qwen3-4B-Thinking-2507 show that LEASH reduces the average reasoning length by 60% across diverse tasks (including in-distribution mathematical reasoning and out-of-distribution domains such as coding and instruction following) while maintaining competitive performance. Our work thus presents a practical and effective paradigm for developing controllable and efficient LLMs that balance reasoning capabilities with computational budgets.
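The adaptive penalty described in this abstract can be sketched as a projected dual-ascent step. This is only an illustration under stated assumptions: the function names, the normalization by the target length, the step size, and the exact form of the shaped reward are hypothetical and are not taken from the paper.

```python
def update_penalty(lam, avg_length, target_length, step_size):
    """One dual-ascent step on the penalty coefficient, projected to stay >= 0.

    The coefficient grows when the average generation exceeds the target
    length and shrinks when it is shorter (normalization is an assumption).
    """
    violation = (avg_length - target_length) / target_length
    return max(0.0, lam + step_size * violation)

def shaped_reward(task_reward, length, target_length, lam):
    """Lagrangian-style reward: task reward minus the weighted length violation."""
    return task_reward - lam * (length - target_length) / target_length
```

For example, starting from a coefficient of 0.1 with a target of 1000 tokens and a step size of 0.5, a batch averaging 1200 tokens raises the coefficient, while a batch averaging 800 tokens lowers it; the projection keeps it from going negative.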

2025


As Large Language Models (LLMs) become more advanced, the security risks they pose also increase. Ensuring that LLM behavior aligns with human values, particularly in mitigating jailbreak attacks with elusive and implicit intentions, has become a significant challenge. To address this issue, we propose a jailbreak defense method called Real Intentions Defense (RID), which involves two phases: soft extraction and hard deletion. In the soft extraction phase, LLMs are leveraged to extract unbiased, genuine intentions, while in the hard deletion phase, a greedy gradient-based algorithm is used to remove the least important parts of a sentence, based on the insight that words with smaller gradients have less impact on its meaning. We conduct extensive experiments on Vicuna and Llama2 models using eight state-of-the-art jailbreak attacks and six benchmark datasets. Our results show a significant reduction in both Attack Success Rate (ASR) and Harmful Score of jailbreak attacks, while maintaining overall model performance. Further analysis sheds light on the underlying mechanisms of our approach.
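The hard deletion phase described in this abstract can be pictured with a minimal sketch. It assumes a hypothetical `importance_fn` callable that returns one score per word (standing in for a gradient magnitude from a model) and a hypothetical `n_remove` budget; the paper's actual scoring and stopping rules are not given here.

```python
def greedy_delete(words, importance_fn, n_remove):
    """Repeatedly drop the word with the smallest importance score.

    importance_fn is a hypothetical callable returning one score per word
    (for example, a gradient magnitude). It is re-evaluated after every
    deletion, which is what makes the procedure greedy.
    """
    kept = list(words)
    for _ in range(min(n_remove, len(kept))):
        scores = importance_fn(kept)
        drop = min(range(len(kept)), key=lambda i: scores[i])
        del kept[drop]
    return kept
```

With word length as a toy stand-in for the score, removing three words from "please a tell me the recipe" leaves "please tell recipe".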

Co-authors

Lu Ma 1

Venues

ACL 2
COLING 1
