Visual reasoning is crucial for multimodal large language models (MLLMs) to address complex chart queries, yet high-quality rationale data remains scarce. Existing methods leverage (M)LLMs for data generation, but direct prompting often yields limited precision and diversity. In this paper, we propose Chain of Functions (CoF), a novel programmatic reasoning data generation pipeline that uses freely explored reasoning paths as supervision to ensure data precision and diversity. Specifically, it starts with human-free exploration over atomic functions (e.g., retrieving the maximum value and arithmetic operations) to generate diverse function chains, which are then translated into linguistic rationales and questions with only a moderately sized open-source LLM.
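To make the mechanism concrete, below is a minimal sketch of function-chain exploration; the function set, names, and sampling logic here are our own illustrative assumptions, not the released CoF implementation:

```python
import random

# Illustrative atomic functions (hypothetical; the actual function set is
# defined in the CoF repository). Transforms map a data series to a series;
# aggregators collapse a series into the final answer.
TRANSFORMS = {
    "sort_ascending": lambda xs: sorted(xs),
    "top_two": lambda xs: sorted(xs, reverse=True)[:2],
    "diff_consecutive": lambda xs: [b - a for a, b in zip(xs, xs[1:])],
}
AGGREGATORS = {"maximum": max, "minimum": min, "sum": sum}

def sample_chain(n_steps, rng=random):
    """Freely explore the space: sample n transforms plus one aggregator."""
    return [rng.choice(list(TRANSFORMS)) for _ in range(n_steps)] + [
        rng.choice(list(AGGREGATORS))
    ]

def execute_chain(chain, series):
    """Run the chain on chart data; the step-by-step trace serves as a
    built-in rationale an LLM can later verbalize into a Q&A pair."""
    trace, values = [], list(series)
    for name in chain[:-1]:
        values = TRANSFORMS[name](values)
        trace.append((name, values))
    answer = AGGREGATORS[chain[-1]](values)
    trace.append((chain[-1], answer))
    return answer, trace

# Example: a random 2-step chain over a bar chart's extracted values.
answer, trace = execute_chain(sample_chain(2), [3, 8, 5, 9])
```

Because every intermediate value is computed by executing the chain, the final answer is program-consistent by construction rather than produced by free-form generation.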
CoF provides multiple benefits: 1) Precision: function-governed generation reduces hallucinations compared to free-form generation; 2) Diversity: enumerating function chains enables varied question taxonomies; 3) Explainability: function chains serve as built-in rationales, allowing fine-grained evaluation beyond overall accuracy; 4) Practicality: it eliminates the reliance on extremely large models. Employing
CoF, we construct the
ChartCoF dataset, with 1.4k complex reasoning Q&A pairs for fine-grained analysis and 50k Q&A pairs for reasoning enhancement. Experiments show that
ChartCoF improves the performance of MLLMs on widely used benchmarks, and fine-grained evaluation on ChartCoF reveals that each MLLM's performance varies across question taxonomies and numbers of reasoning steps. Furthermore, the novel paradigm of function-governed rationale generation in
CoF could inspire broader applications beyond charts. The code and data are publicly available at
https://github.com/microsoft/Chain-of-Functions.