PreGenie: An Agentic Framework for High-quality Visual Presentation Generation

Xiaojie Xu; Xinli Xu; Sirui Chen; Haoyu Chen; Fan Zhang (张帆); Ying-Cong Chen

doi:10.18653/v1/2025.findings-emnlp.165

Abstract

Visual presentations are vital for effective communication. Early attempts to automate their creation using deep learning often faced issues such as poorly organized layouts, inaccurate text summarization, and a lack of image understanding, leading to mismatched visuals and text. These limitations restrict their application in formal contexts like business and scientific research. To address these challenges, we propose PreGenie, an agentic and modular framework powered by multimodal large language models (MLLMs) for generating high-quality visual presentations. PreGenie is built on the Slidev presentation framework, where slides are rendered from Markdown code. It operates in two stages: (1) Analysis and Initial Generation, which summarizes multimodal input and generates initial code, and (2) Review and Re-generation, which iteratively reviews intermediate code and rendered slides to produce final, high-quality presentations. Each stage leverages multiple MLLMs that collaborate and share information. Comprehensive experiments demonstrate that PreGenie excels in multimodal understanding, outperforming existing models in both aesthetics and content consistency, while aligning more closely with human design preferences.
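The two-stage flow described in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: every name below (`generate_presentation`, `Review`, the `stub_*` helpers, `max_rounds`) is hypothetical, and the stubs are plain functions standing in for the MLLM calls and for Slidev rendering. The stopping rule and the round limit are also assumptions, since the abstract does not specify them. The only Slidev detail relied on is that slides in the Markdown source are separated by `---` lines.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Review:
    """Hypothetical reviewer verdict: approve, or return feedback text."""
    approved: bool
    feedback: str


def generate_presentation(
    source: str,
    summarize: Callable[[str], str],
    write_slides: Callable[[str, str], str],
    render: Callable[[str], List[str]],
    review: Callable[[str, List[str]], Review],
    max_rounds: int = 3,  # assumed limit; not taken from the paper
) -> str:
    # Stage 1: analysis and initial generation.
    summary = summarize(source)
    code = write_slides(summary, "")
    # Stage 2: review the code and the rendered slides, then regenerate.
    for _ in range(max_rounds):
        verdict = review(code, render(code))
        if verdict.approved:
            break
        code = write_slides(summary, verdict.feedback)
    return code


# Stand-ins for the model-backed components (assumptions for illustration).
def stub_summarize(source: str) -> str:
    return source.strip()


def stub_write_slides(summary: str, feedback: str) -> str:
    points = [p.strip() for p in summary.split(".") if p.strip()]
    if feedback:
        # Pretend the feedback asked for headings and add one per slide.
        slides = [f"# Point {i}\n\n{p}" for i, p in enumerate(points, 1)]
    else:
        slides = points
    return "\n\n---\n\n".join(slides)


def stub_render(code: str) -> List[str]:
    # A real system would render images; here each slide is its own text.
    return [s.strip() for s in code.split("\n---\n")]


def stub_review(code: str, rendered: List[str]) -> Review:
    missing = [i for i, s in enumerate(rendered, 1) if not s.startswith("# ")]
    if missing:
        return Review(False, f"Add a heading to slides {missing}")
    return Review(True, "")


final = generate_presentation(
    "Slides matter. Layout matters.",
    stub_summarize, stub_write_slides, stub_render, stub_review,
)
print(final)
```

Passing the components in as callables keeps the loop independent of any particular model or renderer, so each stub could be swapped for a real call without changing the control flow.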

Anthology ID: 2025.findings-emnlp.165
Volume: Findings of the Association for Computational Linguistics: EMNLP 2025
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 3045–3063
URL: https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.165/
DOI: 10.18653/v1/2025.findings-emnlp.165
Cite (ACL): Xiaojie Xu, Xinli Xu, Sirui Chen, Haoyu Chen, Fan Zhang, and Ying-Cong Chen. 2025. PreGenie: An Agentic Framework for High-quality Visual Presentation Generation. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 3045–3063, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): PreGenie: An Agentic Framework for High-quality Visual Presentation Generation (Xu et al., Findings 2025)
PDF: https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.165.pdf
Checklist: 2025.findings-emnlp.165.checklist.pdf
