Doc2Chart: Intent-Driven Zero-Shot Chart Generation from Documents

Akriti Jain, Pritika Ramu, Aparna Garimella, Apoorv Saxena


Abstract
Large Language Models (LLMs) have demonstrated strong capabilities in transforming text descriptions or tables into data visualizations via instruction-tuning methods. However, it is not straightforward to apply these methods directly to the more realistic use case of visualizing data from long documents based on user-given intents, as opposed to the user manually pre-selecting the relevant content. We introduce the task of _intent-based chart generation_ from documents: given a user-specified intent and document(s), the goal is to generate a chart that adheres to the intent and is grounded in the document(s), in a zero-shot setting. We propose an unsupervised, two-stage framework in which an LLM first extracts relevant information from the document(s) by decomposing the intent, then iteratively validates and refines this data. Next, a heuristic-guided module selects an appropriate chart type before final code generation. To assess the data accuracy of the generated charts, we propose an attribution-based metric that uses a structured textual representation of charts, instead of relying on visual decoding metrics that often fail to capture the chart data effectively. To validate our approach, we curate a dataset comprising 1,242 <intent, document, charts> tuples from two domains, finance and scientific, in contrast to existing datasets, which are largely limited to parallel text descriptions or tables and their corresponding charts. We compare our approach with baselines that use single-shot chart generation with LLMs and query-based retrieval methods; our method outperforms the best baselines by up to 9 points in chart data accuracy and up to 17 points in chart type.
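
The two stages described in the abstract can be sketched roughly as follows. This is an illustrative outline only, not the authors' implementation: the function names, the `extract` and `validate` callables (stand-ins for LLM calls), the round limit, and the specific chart-type rules are all assumptions, since the abstract does not specify them.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]


def extract_with_refinement(
    intent: str,
    document: str,
    extract: Callable[..., List[Row]],
    validate: Callable[[str, str, List[Row]], Optional[str]],
    max_rounds: int = 3,
) -> List[Row]:
    """Stage 1 (sketch): extract data for the intent, then validate and refine.

    `extract` and `validate` are hypothetical stand-ins for LLM calls.
    `validate` returns None when the rows look correct, otherwise feedback text.
    """
    rows = extract(intent, document, feedback=None)
    for _ in range(max_rounds):
        feedback = validate(intent, document, rows)
        if feedback is None:
            break  # data accepted, stop refining
        rows = extract(intent, document, feedback=feedback)
    return rows


def select_chart_type(rows: List[Row], x_key: str, y_key: str) -> str:
    """Stage 2 (sketch): pick a chart type with simple, assumed heuristics."""
    if not rows:
        raise ValueError("no data extracted")
    xs = [r[x_key] for r in rows]
    # Assumed rule: year-like integer x values with 3+ points suggest a trend.
    if len(xs) > 2 and all(isinstance(x, int) and 1900 <= x <= 2100 for x in xs):
        return "line"
    # Assumed rule: a few categories summing to 100 suggest parts of a whole.
    total = sum(float(r[y_key]) for r in rows)
    if len(rows) <= 6 and abs(total - 100.0) < 1e-6:
        return "pie"
    # Fallback for categorical comparisons.
    return "bar"
```

In a full pipeline, the selected chart type and the validated rows would then be passed to a code-generation step, which is omitted here.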
Anthology ID:
2025.emnlp-main.1770
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
34936–34951
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1770/
Cite (ACL):
Akriti Jain, Pritika Ramu, Aparna Garimella, and Apoorv Saxena. 2025. Doc2Chart: Intent-Driven Zero-Shot Chart Generation from Documents. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 34936–34951, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Doc2Chart: Intent-Driven Zero-Shot Chart Generation from Documents (Jain et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1770.pdf
Checklist:
 2025.emnlp-main.1770.checklist.pdf