Finding RELIEF: Shaping Reasoning Behavior without Reasoning Supervision via Belief Engineering

Chak Tou Leong; Dingwei Chen; Heming Xia; Qingyu Yin; Sunbowen Lee; Jian Wang; Wenjie Li

Abstract

Large reasoning models (LRMs) have achieved remarkable success through step-by-step chains of thought, yet they often suffer from excessive redundancy or unfaithful reasoning. Existing methods for shaping LRM behavior typically rely on reinforcement learning or fine-tuning with gold-standard reasoning traces, a paradigm that is both computationally expensive and difficult to scale. In this paper, we reveal that LRMs possess latent reasoning beliefs that internally track their own reasoning traits, and that these beliefs can be captured through simple logit probing without specialized training. Building on this insight, we propose Reasoning Belief Engineering (RELIEF), a simple yet effective framework that shapes LRM behavior by aligning the model’s self-concept with a target belief blueprint. Crucially, RELIEF completely bypasses the need for reasoning-trace supervision. It internalizes desired traits by fine-tuning on synthesized, self-reflective QA pairs that affirm the target belief. Extensive experiments on efficiency and faithfulness tasks demonstrate that RELIEF matches or outperforms behavior-supervised and preference-based baselines while requiring significantly lower training costs. Our analysis further validates that shifting a model’s reasoning belief effectively shapes its actual behavior.
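The abstract does not spell out how the logit probing is carried out, so the following is only an illustrative sketch of one common form of the idea, not the authors' procedure. It assumes a self-reflective yes/no question about a reasoning trait has been posed to the model and that the next-token logits for two answer tokens are available; the function name `belief_score`, the token strings "Yes" and "No", and the example logit values are all hypothetical.

```python
import math


def belief_score(logits: dict[str, float], yes: str = "Yes", no: str = "No") -> float:
    """Return the probability of the affirmative token, renormalized
    over just the two answer tokens (a two-way softmax).

    `logits` maps token strings to raw next-token logits. The token
    names are assumptions made for illustration only.
    """
    y, n = logits[yes], logits[no]
    m = max(y, n)  # subtract the max for numerical stability
    ey, en = math.exp(y - m), math.exp(n - m)
    return ey / (ey + en)


# Hypothetical logits read off after a question such as
# "Do you reason concisely?" (values invented for the example).
example_logits = {"Yes": 2.0, "No": 0.0}
score = belief_score(example_logits)
```

Under these assumptions, a score above 0.5 would indicate that the model leans toward affirming the trait, and comparing scores before and after fine-tuning would be one way to see whether the belief has shifted.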

Anthology ID: 2026.findings-acl.218
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 4444–4467
URL: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.218/
Cite (ACL): Chak Tou Leong, Dingwei Chen, Heming Xia, Qingyu Yin, Sunbowen Lee, Jian Wang, and Wenjie Li. 2026. Finding RELIEF: Shaping Reasoning Behavior without Reasoning Supervision via Belief Engineering. In Findings of the Association for Computational Linguistics: ACL 2026, pages 4444–4467, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Finding RELIEF: Shaping Reasoning Behavior without Reasoning Supervision via Belief Engineering (Leong et al., Findings 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.218.pdf
Checklist: 2026.findings-acl.218.checklist.pdf