Evaluating LLM-Generated Diagrams as Graphs

Chumeng Liang; Jiaxuan You

Abstract

Diagrams play a central role in research papers for conveying ideas, yet they are often notoriously complex and labor-intensive to create. Although diagrams are presented as images, standard image generative models struggle to produce clear diagrams with well-defined structure. We argue that a promising direction is to generate demonstration diagrams directly in textual form as SVGs, which can leverage recent advances in large language models (LLMs). However, due to the complexity of components and the multimodal nature of diagrams, sufficiently discriminative and explainable metrics for evaluating the quality of LLM-generated diagrams remain lacking. In this paper, we propose DiagramEval, a novel evaluation metric designed to assess demonstration diagrams generated by LLMs. Specifically, DiagramEval conceptualizes diagrams as graphs, treating text elements as nodes and their connections as directed edges, and evaluates diagram quality using two new groups of metrics: node alignment and path alignment. For the first time, we effectively evaluate diagrams produced by state-of-the-art LLMs on recent research literature, quantitatively demonstrating the validity of our metrics. Furthermore, we show how the enhanced explainability of our proposed metrics offers valuable insights into the characteristics of LLM-generated diagrams.
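The graph view described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify how nodes are matched or how paths are compared, so the code below assumes exact string matching on node text and treats a "path" as any ordered pair of nodes connected by a chain of directed edges. All function names (`reachable_pairs`, `node_alignment`, `path_alignment`) and the toy diagrams are hypothetical.

```python
from collections import defaultdict


def reachable_pairs(edges):
    """Return every ordered pair (u, v) where v can be reached from u
    by following one or more directed edges."""
    adjacency = defaultdict(set)
    for source, target in edges:
        adjacency[source].add(target)

    pairs = set()
    for start in list(adjacency):
        stack = list(adjacency[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            pairs.add((start, node))
            # .get avoids creating new keys in the defaultdict
            stack.extend(adjacency.get(node, ()))
    return pairs


def precision_recall_f1(predicted, reference):
    """Set-overlap precision, recall, and F1 (0.0 when undefined)."""
    overlap = len(predicted & reference)
    precision = overlap / len(predicted) if predicted else 0.0
    recall = overlap / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def node_alignment(generated_nodes, reference_nodes):
    """Assumption: nodes align only when their text is identical."""
    return precision_recall_f1(set(generated_nodes), set(reference_nodes))


def path_alignment(generated_edges, reference_edges):
    """Assumption: paths are compared as reachable (source, target) pairs."""
    return precision_recall_f1(
        reachable_pairs(generated_edges), reachable_pairs(reference_edges)
    )


# Toy reference diagram and a generated diagram that drops one node.
reference_nodes = {"Input", "Encoder", "Decoder", "Output"}
reference_edges = [("Input", "Encoder"), ("Encoder", "Decoder"), ("Decoder", "Output")]
generated_nodes = {"Input", "Encoder", "Output"}
generated_edges = [("Input", "Encoder"), ("Encoder", "Output")]

node_scores = node_alignment(generated_nodes, reference_nodes)
path_scores = path_alignment(generated_edges, reference_edges)
print("node precision/recall/F1:", node_scores)
print("path precision/recall/F1:", path_scores)
```

In this toy case the generated diagram contains no spurious nodes or paths, so both precisions are perfect, while the missing "Decoder" node lowers node recall and removes half of the reference paths. Reporting the two groups separately is what makes the scores explainable: a low node score points to missing or invented text elements, whereas a low path score points to wrong connections.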

Anthology ID: 2025.emnlp-main.640
Volume: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 12689–12701
URL: https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.640/
Cite (ACL): Chumeng Liang and Jiaxuan You. 2025. Evaluating LLM-Generated Diagrams as Graphs. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 12689–12701, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): Evaluating LLM-Generated Diagrams as Graphs (Liang & You, EMNLP 2025)
PDF: https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.640.pdf
Checklist: 2025.emnlp-main.640.checklist.pdf
