Ada-RS: Adaptive Rejection Sampling for Selective Thinking

Yirou Ge, Yixi Li, Alec M. Chiu, Shivani Shekhar, Zijie Pan, Avinash Thangali, Yun-Shiuan Chuang, Chaitanya Kulkarni, Uma Kona, Linsey Pang, Prakhar Mehrotra


Abstract
Large language models (LLMs) are increasingly being deployed in cost- and latency-sensitive settings. While chain-of-thought improves reasoning, it can waste tokens on simple requests. We study selective thinking for tool-using LLMs and introduce Adaptive Rejection Sampling (Ada-RS), an algorithm-agnostic sample filtering framework for learning selective and efficient reasoning. For each given context, Ada-RS scores multiple sampled completions with an adaptive length-penalized reward, then applies stochastic rejection sampling to retain only high-reward candidates (or preference pairs) for downstream optimization. We demonstrate how Ada-RS plugs into both preference pair (e.g. DPO) and grouped policy optimization strategies (e.g. DAPO). Using Qwen3-8B with LoRA on a synthetic tool call-oriented e-commerce benchmark, Ada-RS improves the accuracy-efficiency frontier over standard algorithms by reducing average output tokens by up to ∼80% and reducing thinking rate by up to ∼95% while maintaining or improving tool call accuracy. We further demonstrate that these gains generalize across model scales (Qwen3-1.7B, 8B, 14B) and domains (τ²-Bench airline and telecom). These results highlight that training signal selection is a powerful lever for efficient reasoning in latency-sensitive deployments.
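The abstract names the two steps (length-penalized scoring, then stochastic rejection sampling) without giving formulas, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. The reward shape, the rule that scales the length penalty with per-context sample accuracy, the exponential acceptance probability, and every name and default (`Candidate`, `adaptive_penalty`, `beta`, `max_tokens`) are hypothetical choices made for illustration.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str       # sampled completion
    correct: bool   # did the tool call match the reference?
    n_tokens: int   # output length in tokens


def adaptive_penalty(candidates, max_penalty=0.5):
    # ASSUMPTION: penalize length more strongly when the context looks easy,
    # measured here as the fraction of sampled completions that are correct.
    accuracy = sum(c.correct for c in candidates) / len(candidates)
    return max_penalty * accuracy


def length_penalized_reward(c, penalty, max_tokens):
    # ASSUMPTION: correctness reward of 1 or 0, minus a penalty that grows
    # linearly with relative output length (capped at max_tokens).
    base = 1.0 if c.correct else 0.0
    return base - penalty * min(c.n_tokens / max_tokens, 1.0)


def rejection_sample(candidates, max_tokens=1024, beta=0.1, rng=None):
    # Keep each candidate with probability exp((r - r_max) / beta).
    # The top-scoring candidate has probability 1 and is always kept;
    # lower-reward candidates survive with exponentially smaller probability.
    rng = rng or random.Random()
    penalty = adaptive_penalty(candidates)
    rewards = [length_penalized_reward(c, penalty, max_tokens) for c in candidates]
    r_max = max(rewards)
    kept = []
    for c, r in zip(candidates, rewards):
        if rng.random() < math.exp((r - r_max) / beta):
            kept.append((c, r))
    return kept


samples = [
    Candidate("short direct tool call", True, 50),
    Candidate("long chain-of-thought, same tool call", True, 900),
    Candidate("short but wrong tool call", False, 40),
    Candidate("brief reasoning, correct tool call", True, 60),
]
kept = rejection_sample(samples, rng=random.Random(0))
```

Under these assumptions the retained `(candidate, reward)` pairs could feed either setup mentioned above: pairing a kept completion against a rejected one for a DPO-style loss, or passing the filtered group to a grouped policy optimizer.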
Anthology ID:
2026.acl-industry.88
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track)
Month:
July
Year:
2026
Address:
San Diego, California, USA
Editors:
Yunyao Li, Georg Rehm, Mei Tu
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1265–1276
URL:
https://preview.aclanthology.org/ingestion-form-platform/2026.acl-industry.88/
Cite (ACL):
Yirou Ge, Yixi Li, Alec M. Chiu, Shivani Shekhar, Zijie Pan, Avinash Thangali, Yun-Shiuan Chuang, Chaitanya Kulkarni, Uma Kona, Linsey Pang, and Prakhar Mehrotra. 2026. Ada-RS: Adaptive Rejection Sampling for Selective Thinking. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track), pages 1265–1276, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal):
Ada-RS: Adaptive Rejection Sampling for Selective Thinking (Ge et al., ACL 2026)
PDF:
https://preview.aclanthology.org/ingestion-form-platform/2026.acl-industry.88.pdf