Amir H. Rezaeian
2026
Budget-Aware Anytime Reasoning with LLM-Synthesized Preference Data
Xuanming Zhang | Shwan Ashrafi | Aziza Mirsaidova | Amir H. Rezaeian | Miguel Ballesteros | Lydia Chilton | Zhou Yu | Dan Roth
Findings of the Association for Computational Linguistics: ACL 2026
We study the reasoning behavior of large language models (LLMs) under limited computation budgets. In such settings, producing useful partial solutions quickly is often more practical than exhaustive reasoning, which incurs high inference costs. Many real-world tasks, such as trip planning, require models to deliver the best possible output within a fixed reasoning budget. We introduce an anytime reasoning framework and the Anytime Index, a metric that quantifies how effectively solution quality improves as reasoning tokens increase. To further enhance efficiency, we propose an inference-time self-improvement method using LLM-synthesized preference data, where models learn from their own reasoning comparisons to produce better intermediate solutions. Experiments on NaturalPlan (Trip), AIME, and GPQA datasets show consistent gains across Grok-3, GPT-oss, GPT-4.1/4o, and LLaMA models, improving both reasoning quality and efficiency under budget constraints.
Barriers to Discrete Reasoning with Transformers: A Survey Across Depth, Exactness, and Bandwidth
Michelle Yuan | Weiyi Sun | Amir H. Rezaeian | Jyotika Singh | Sandip Ghoshal | Yao-Ting Wang | Miguel Ballesteros | Yassine Benajiba
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Transformers have become the foundational architecture for a broad spectrum of sequence modeling applications, underpinning state-of-the-art systems in natural language processing, vision, and beyond. However, their theoretical limitations in discrete reasoning tasks, such as arithmetic, logical inference, and algorithmic composition, remain a critical open problem. In this survey, we synthesize recent advances from three theoretical perspectives: circuit complexity, approximation theory, and communication complexity, to clarify the structural and computational barriers that transformers face when performing symbolic computations. By connecting these established theoretical frameworks, we provide an accessible and unified account of why current transformer architectures struggle to implement exact discrete algorithms, even as they excel at pattern matching and interpolation. We review key definitions, seminal results, and illustrative examples, highlighting challenges such as depth constraints, difficulty approximating discontinuities, and bottlenecks in inter-token communication. Finally, we discuss implications for model design and suggest promising directions for overcoming these foundational limitations.