Self-play through Computational Runtimes improves Chart Reasoning

Tautvydas Misiūnas, Hassan Mansoor, Jasper Uijlings, Oriana Riva, Victor Carbune


Abstract
Vision-language models (VLMs) achieve impressive zero-shot performance on multimodal reasoning tasks. Typically, the best reported performance is achieved with a zero- or few-shot prompt. We observe that asking the model to take other routes to solving the same task, such as through code generation, hurts performance. Furthermore, training sets are typically no longer useful for improving model performance through few-shot learning, due to their use in training. Indeed, we observe that auto-prompting techniques such as DSPy (CITATION), when applied to training sets, do not produce few-shot examples that further improve validation performance. Further, when used in conjunction with program-of-thought, performance becomes even worse. Our work overcomes these limitations by introducing a novel self-play programming interface, which leverages the ability of VLMs to first generate code that decomposes a complex visual reasoning task into sub-tasks, and then to use itself, or other models, as a tool to solve the decomposed tasks. Our approach prevents DSPy from suffering performance drops when applied iteratively to training sets. Furthermore, it outperforms zero-shot baselines on difficult chart reasoning benchmarks. We report the performance of our approach on ChartQA, PlotQA and ChartFC. This enables large models, such as Gemini or GPT, to autonomously learn how to use themselves as tools and to iteratively improve without the need for additional data.
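The decompose-then-solve pattern the abstract describes can be sketched as follows. This is an illustrative sketch, not the authors' code: `vlm_answer` and `generated_program` are hypothetical names, and the model call is stubbed with a lookup table where a real system would query a VLM such as Gemini with the chart image.

```python
# Illustrative sketch of self-play through a computational runtime
# (assumption: names and structure are hypothetical, not from the paper).

def vlm_answer(question: str, chart: dict) -> str:
    """Stub for a VLM-as-tool call that reads one value off a chart.
    A real system would send the chart image and question to the model."""
    return str(chart.get(question, "unknown"))

def generated_program(chart: dict) -> str:
    """Code the VLM might emit for: 'What is the sum of series A and B?'
    The complex question is decomposed into simple per-series lookups,
    each solved by calling the model as a tool; the arithmetic that
    models often get wrong is delegated to the runtime."""
    a = float(vlm_answer("value of A", chart))
    b = float(vlm_answer("value of B", chart))
    return str(a + b)

# Toy chart stand-in: a mapping from sub-question to ground-truth value.
chart = {"value of A": "3", "value of B": "4"}
print(generated_program(chart))  # 7.0
```

The design point is that the runtime, not the model, performs the final combination step, so each individual model call is reduced to a simple perception sub-task.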
Anthology ID:
2025.findings-acl.559
Volume:
Findings of the Association for Computational Linguistics: ACL 2025
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
10731–10746
URL:
https://preview.aclanthology.org/landing_page/2025.findings-acl.559/
Cite (ACL):
Tautvydas Misiūnas, Hassan Mansoor, Jasper Uijlings, Oriana Riva, and Victor Carbune. 2025. Self-play through Computational Runtimes improves Chart Reasoning. In Findings of the Association for Computational Linguistics: ACL 2025, pages 10731–10746, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Self-play through Computational Runtimes improves Chart Reasoning (Misiūnas et al., Findings 2025)
PDF:
https://preview.aclanthology.org/landing_page/2025.findings-acl.559.pdf