Out-of-Context Reasoning in Large Language Models

Jonathan Shaki, Emanuele La Malfa, Michael J. Wooldridge, Sarit Kraus


Abstract
We study how large language models (LLMs) reason about memorized knowledge through simple binary relations such as equality (=), inequality (<), and inclusion (⊆). Unlike in-context reasoning, the axioms (e.g., a < b, b < c) are seen only during training and are not provided in the task prompt (e.g., evaluating a < c). The tasks require one or more reasoning steps and the aggregation of data from one or more sources, revealing how performance changes with task complexity. We introduce a lightweight technique, out-of-context representation learning, which trains only new token embeddings on the axioms and evaluates them on unseen tasks. Across reflexivity, symmetry, and transitivity tests, LLMs mostly perform statistically significantly better than chance, so the correct answer can be extracted by testing multiple phrasing variations, yet they still fall short of reasoning consistently on every single query. Analysis shows that the learned embeddings are organized in structured ways, suggesting genuine relational understanding. Surprisingly, it also indicates that the core reasoning happens during training, not at inference.
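The abstract describes out-of-context representation learning: freeze the pretrained model, train only the embeddings of newly added entity tokens on axiom sentences, then query unseen facts without the axioms in the prompt. Below is a minimal sketch of that recipe under an assumed PyTorch/Transformers setup; the model choice, token names, axiom phrasing, and hyperparameters are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of out-of-context representation learning as summarized in the
# abstract: freeze a pretrained causal LM and train ONLY the embedding rows of
# newly added entity tokens on axiom sentences (e.g., "<ent_a> < <ent_b>"),
# then evaluate an unseen query (e.g., "<ent_a> < <ent_c>") with no axioms in
# the prompt. All names and settings here are assumptions for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: any small causal LM suffices for the sketch
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Add fresh tokens for the abstract entities; only their embeddings get trained.
new_tokens = ["<ent_a>", "<ent_b>", "<ent_c>"]
tok.add_tokens(new_tokens)
model.resize_token_embeddings(len(tok))
new_ids = tok.convert_tokens_to_ids(new_tokens)

# Freeze every pretrained parameter, then re-enable gradients on the input
# embedding matrix; a gradient mask below keeps all pretrained rows untouched.
for p in model.parameters():
    p.requires_grad = False
emb = model.get_input_embeddings().weight
emb.requires_grad = True

axioms = ["<ent_a> < <ent_b>.", "<ent_b> < <ent_c>."]  # seen only in training
opt = torch.optim.Adam([emb], lr=1e-3)

mask = torch.zeros(emb.shape[0], 1)
mask[new_ids] = 1.0  # update only the new-token embedding rows

for step in range(200):
    for text in axioms:
        batch = tok(text, return_tensors="pt")
        out = model(**batch, labels=batch["input_ids"])
        opt.zero_grad()
        out.loss.backward()
        emb.grad *= mask  # zero gradients for all pretrained rows
        opt.step()

# Out-of-context query: the axioms never appear in this prompt.
query = tok("True or false: <ent_a> < <ent_c>. Answer:", return_tensors="pt")
with torch.no_grad():
    logits = model(**query).logits[0, -1]
print(tok.decode(logits.argmax()))
```

The gradient mask is one simple way to restrict updates to the new-token rows while leaving every pretrained weight, including the tied output head, unchanged; the paper's actual optimization scheme and evaluation protocol may differ.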
Anthology ID:
2025.findings-emnlp.1068
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
19606–19615
URL:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1068/
DOI:
10.18653/v1/2025.findings-emnlp.1068
Cite (ACL):
Jonathan Shaki, Emanuele La Malfa, Michael J. Wooldridge, and Sarit Kraus. 2025. Out-of-Context Reasoning in Large Language Models. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 19606–19615, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Out-of-Context Reasoning in Large Language Models (Shaki et al., Findings 2025)
PDF:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1068.pdf
Checklist:
 2025.findings-emnlp.1068.checklist.pdf