How Is LLM Reasoning Distracted by Irrelevant Context? An Analysis Using a Controlled Benchmark

Minglai Yang, Ethan Huang, Liang Zhang, Mihai Surdeanu, William Yang Wang, Liangming Pan


Abstract
We introduce Grade School Math with Distracting Context (GSM-DC), a synthetic benchmark for evaluating the robustness of Large Language Models’ (LLMs) reasoning against systematically controlled irrelevant context (IC). GSM-DC constructs symbolic reasoning graphs with precise distractor injections, enabling rigorous, reproducible evaluation. Our experiments demonstrate that LLMs are significantly sensitive to IC, which affects both reasoning path selection and arithmetic accuracy. Additionally, training models with strong distractors improves performance in both in-distribution and out-of-distribution scenarios. We further propose a stepwise tree search guided by a process reward model, which notably enhances robustness in out-of-distribution conditions.
Anthology ID:
2025.emnlp-main.674
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
13340–13358
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.674/
Cite (ACL):
Minglai Yang, Ethan Huang, Liang Zhang, Mihai Surdeanu, William Yang Wang, and Liangming Pan. 2025. How Is LLM Reasoning Distracted by Irrelevant Context? An Analysis Using a Controlled Benchmark. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 13340–13358, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
How Is LLM Reasoning Distracted by Irrelevant Context? An Analysis Using a Controlled Benchmark (Yang et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.674.pdf
Checklist:
 2025.emnlp-main.674.checklist.pdf