LUMINA: Long-horizon Understanding for Multi-turn Interactive Agents

Amin Rakhsha, Thomas Hehn, Pietro Mazzaglia, Fabio Valerio Massoli, Arash Behboodi, Tribhuvanesh Orekondy


Abstract
Large language models can perform well on many isolated tasks, yet they continue to struggle on multi-turn, long-horizon agentic problems that require skills such as planning, state tracking, and long-context processing. In this work, we aim to better understand the relative importance of advancing these underlying capabilities for success on such tasks. We develop an oracle counterfactual framework for multi-turn problems that asks: how would an agent perform if it could leverage an oracle to perfectly execute a specific skill? The change in the agent’s performance due to this oracle assistance allows us to measure how critical that skill is to the future advancement of AI agents. We introduce a suite of procedurally generated, game-like tasks with tunable complexity. These controlled environments allow us to provide precise oracle interventions, such as perfect planning or flawless state tracking, and make it possible to isolate the contribution of each oracle without the confounding effects present in real-world benchmarks. Our results show that while some interventions (e.g., planning) consistently improve performance across settings, the usefulness of other skills depends on the properties of the environment and the language model. Our work sheds light on the challenges of multi-turn agentic environments and can guide future efforts in developing AI agents and language models.
Anthology ID:
2026.findings-acl.190
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
3913–3926
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.190/
Cite (ACL):
Amin Rakhsha, Thomas Hehn, Pietro Mazzaglia, Fabio Valerio Massoli, Arash Behboodi, and Tribhuvanesh Orekondy. 2026. LUMINA: Long-horizon Understanding for Multi-turn Interactive Agents. In Findings of the Association for Computational Linguistics: ACL 2026, pages 3913–3926, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
LUMINA: Long-horizon Understanding for Multi-turn Interactive Agents (Rakhsha et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.190.pdf
Checklist:
 2026.findings-acl.190.checklist.pdf