There’s No Such Thing as Simple Reasoning for LLMs

Nurul Fajrin Ariyani, Zied Bouraoui, Richard Booth, Steven Schockaert


Abstract
Large Language Models (LLMs) have been widely found to struggle with logical reasoning: even fine-tuned models fail dramatically on out-of-distribution problems. However, existing work has focused on relatively complex “many-hop” reasoning problems. In this paper, we analyse the performance of fine-tuned LLMs on simple reasoning problems, all of which can be solved in at most three inference steps. Because these problems are so simple, the model cannot encounter test problems that are fundamentally different from those it has seen during training. Nevertheless, we find that the models remain highly brittle, being susceptible to seemingly innocent perturbations such as adding duplicates to the set of premises or shuffling the order in which the premises are presented.
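
To make the two perturbations mentioned in the abstract concrete, here is a minimal sketch (not the authors' code; the function, parameter names, and example premises are hypothetical) of how duplicate premises and premise shuffling could be applied to a reasoning problem while leaving the entailed conclusion unchanged:

```python
import random

def perturb_premises(premises, mode="shuffle", num_duplicates=2, seed=0):
    """Apply a simple perturbation to a list of premise strings.

    'shuffle' reorders the premises; 'duplicate' re-inserts copies of
    randomly chosen premises at random positions. Neither perturbation
    changes which conclusions follow from the premises.
    """
    rng = random.Random(seed)
    perturbed = list(premises)
    if mode == "shuffle":
        rng.shuffle(perturbed)
    elif mode == "duplicate":
        for _ in range(num_duplicates):
            # Re-insert a copy of an existing premise at a random position.
            perturbed.insert(rng.randrange(len(perturbed) + 1), rng.choice(premises))
    else:
        raise ValueError(f"unknown mode: {mode}")
    return perturbed

# Hypothetical three-step reasoning problem (made-up predicates).
premises = [
    "Every squumpus is a yumpus.",
    "Every yumpus is a dumpus.",
    "Rex is a squumpus.",
]
print(perturb_premises(premises, mode="duplicate"))
```

Because both perturbations preserve exactly the facts that the conclusion depends on, a model that genuinely reasons over the premises should give the same answer before and after the perturbation.
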
Anthology ID: 2025.findings-acl.232
Volume: Findings of the Association for Computational Linguistics: ACL 2025
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 4503–4514
URL: https://preview.aclanthology.org/landing_page/2025.findings-acl.232/
Cite (ACL): Nurul Fajrin Ariyani, Zied Bouraoui, Richard Booth, and Steven Schockaert. 2025. There’s No Such Thing as Simple Reasoning for LLMs. In Findings of the Association for Computational Linguistics: ACL 2025, pages 4503–4514, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): There’s No Such Thing as Simple Reasoning for LLMs (Ariyani et al., Findings 2025)
PDF: https://preview.aclanthology.org/landing_page/2025.findings-acl.232.pdf