Exploring the Hidden Reasoning Process of Large Language Models by Misleading Them

Guanyu Chen, Peiyang Wang, Yizhou Jiang, Yuqian Liu, Chujie Zhao, Ying Fang, Tianren Zhang, Feng Chen


Abstract
Large language models (LLMs) have been able to perform various forms of reasoning tasks in a wide range of scenarios, but are they truly engaging in task abstraction and rule-based reasoning beyond mere memorization? To answer this question, we propose a novel experimental approach, Misleading Fine-Tuning (MisFT), to examine whether LLMs perform abstract reasoning by altering their original understanding of fundamental rules. In particular, by constructing datasets with math expressions or logical formulas that contradict correct principles, we fine-tune the model to learn those contradictory rules and assess its generalization ability on unseen test domains. Through a series of experiments, we find that current LLMs are capable of applying contradictory rules to solve practical math word problems and natural language reasoning tasks, implying the presence of an internal mechanism in LLMs that abstracts before reasoning.
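
To make the dataset-construction step concrete, below is a minimal sketch of building a MisFT-style training set under an assumed counterfactual rule (addition defined as a + b + 1). The rule, number ranges, prompt format, and file name are illustrative assumptions for exposition, not the paper's actual data; the idea is simply that the fine-tuned model is then tested on unseen word problems to see whether it generalizes the shifted rule.

```python
# Minimal sketch of a "misleading" fine-tuning dataset: arithmetic pairs
# labeled with a counterfactual rule (a + b -> a + b + 1). Illustrative
# only; the paper's actual rules and data format may differ.
import json
import random

def misleading_sum(a: int, b: int) -> int:
    """Counterfactual addition: every sum is off by one."""
    return a + b + 1

def build_dataset(n: int, lo: int = 0, hi: int = 99, seed: int = 0) -> list[dict]:
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        a, b = rng.randint(lo, hi), rng.randint(lo, hi)
        data.append({
            "prompt": f"{a} + {b} =",
            "completion": f" {misleading_sum(a, b)}",
        })
    return data

if __name__ == "__main__":
    # Write prompt/completion pairs in a generic JSONL fine-tuning format.
    with open("misft_addition.jsonl", "w") as f:
        for example in build_dataset(10_000):
            f.write(json.dumps(example) + "\n")
```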
Anthology ID:
2025.findings-emnlp.1102
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
20263–20278
URL:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1102/
DOI:
10.18653/v1/2025.findings-emnlp.1102
Cite (ACL):
Guanyu Chen, Peiyang Wang, Yizhou Jiang, Yuqian Liu, Chujie Zhao, Ying Fang, Tianren Zhang, and Feng Chen. 2025. Exploring the Hidden Reasoning Process of Large Language Models by Misleading Them. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 20263–20278, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Exploring the Hidden Reasoning Process of Large Language Models by Misleading Them (Chen et al., Findings 2025)
PDF:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1102.pdf
Checklist:
2025.findings-emnlp.1102.checklist.pdf