Jasper Uijlings


2025

Self-play through Computational Runtimes improves Chart Reasoning
Tautvydas Misiūnas | Hassan Mansoor | Jasper Uijlings | Oriana Riva | Victor Carbune
Findings of the Association for Computational Linguistics: ACL 2025

Vision-language models (VLMs) achieve impressive zero-shot performance on multimodal reasoning tasks. Typically, the best reported performance is achieved with a zero-shot or few-shot prompt. We observe that asking the model to take other routes to solving the same task, such as code generation, hurts performance. Furthermore, training sets are typically no longer useful for improving model performance through few-shot learning, because they have already been used in training. Indeed, we observe that auto-prompting techniques such as DSPy (CITATION), when applied to training sets, do not produce few-shot examples that further improve validation performance. Moreover, when used in conjunction with program-of-thought, performance becomes even worse. Our work overcomes these limitations by introducing a novel self-play programming interface that leverages the ability of VLMs to first generate code that decomposes a complex visual reasoning task into sub-tasks, and then to use themselves, or other models, as tools to solve the decomposed tasks. Our approach allows DSPy to avoid performance drops when applied iteratively to training sets. Furthermore, it outperforms zero-shot baselines on difficult chart reasoning benchmarks. We report the performance of our approach on ChartQA, PlotQA and ChartFC. This enables large models, such as Gemini or GPT, to autonomously learn how to use themselves as tools and iteratively improve without the need for additional data.

2014

TUHOI: Trento Universal Human Object Interaction Dataset
Dieu-Thu Le | Jasper Uijlings | Raffaella Bernardi
Proceedings of the Third Workshop on Vision and Language

2013

Exploiting Language Models for Visual Recognition
Dieu-Thu Le | Jasper Uijlings | Raffaella Bernardi
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

VSEM: An open library for visual semantics representation
Elia Bruni | Ulisse Bordignon | Adam Liska | Jasper Uijlings | Irina Sergienya
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics: System Demonstrations