ADO: Automatic Data Optimization for Inputs in LLM Prompts
Sam Lin, Wenyue Hua, Lingyao Li, Zhenting Wang, Yongfeng Zhang
Abstract
This study explores a novel approach to enhance the performance of Large Language Models (LLMs) through the optimization of input data within prompts. While previous research has primarily focused on refining instruction components and augmenting input data with in-context examples, our work investigates the potential benefits of optimizing the input data itself. We introduce a two-pronged strategy for input data optimization: content engineering and structural reformulation. Content engineering involves imputing missing values, removing irrelevant attributes, and enriching profiles by generating additional information inferred from existing attributes. Subsequent to content engineering, structural reformulation is applied to optimize the presentation of the modified content to LLMs, given their sensitivity to input format. Our findings suggest that these optimizations can significantly improve the performance of LLMs in various tasks, offering a promising avenue for future research in prompt engineering. The source code is available at https://github.com/glin2229/Automatic-Data-Optimization.
- Anthology ID: 2025.findings-acl.1340
- Volume: Findings of the Association for Computational Linguistics: ACL 2025
- Month: July
- Year: 2025
- Address: Vienna, Austria
- Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
- Venues: Findings | WS
- Publisher: Association for Computational Linguistics
- Pages: 26134–26146
- URL: https://preview.aclanthology.org/ingestion-acl-25/2025.findings-acl.1340/
- Cite (ACL): Sam Lin, Wenyue Hua, Lingyao Li, Zhenting Wang, and Yongfeng Zhang. 2025. ADO: Automatic Data Optimization for Inputs in LLM Prompts. In Findings of the Association for Computational Linguistics: ACL 2025, pages 26134–26146, Vienna, Austria. Association for Computational Linguistics.
- Cite (Informal): ADO: Automatic Data Optimization for Inputs in LLM Prompts (Lin et al., Findings 2025)
- PDF: https://preview.aclanthology.org/ingestion-acl-25/2025.findings-acl.1340.pdf
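
The two stages named in the abstract, content engineering followed by structural reformulation, can be illustrated with a small hand-written sketch. This is not the authors' implementation (their code is in the linked repository, and the abstract does not specify how each step is carried out). The function names, the `defaults` and `irrelevant` parameters, the toy `age_group` rule, and the `key: value` output layout are all assumptions made purely for illustration.

```python
from typing import Any, Dict, Set


def content_engineering(
    record: Dict[str, Any],
    defaults: Dict[str, Any],
    irrelevant: Set[str],
) -> Dict[str, Any]:
    """Hypothetical helper: drop irrelevant attributes, impute missing
    values, and enrich the record with one derived attribute."""
    cleaned: Dict[str, Any] = {}
    for key, value in record.items():
        if key in irrelevant:
            continue  # remove attributes judged irrelevant to the task
        if value is None:
            # impute a missing value from caller-supplied defaults
            value = defaults.get(key, "unknown")
        cleaned[key] = value

    # Enrichment (toy rule, an assumption): infer an age group from age.
    age = cleaned.get("age")
    if isinstance(age, int):
        if age >= 65:
            cleaned["age_group"] = "senior"
        elif age >= 18:
            cleaned["age_group"] = "adult"
        else:
            cleaned["age_group"] = "minor"
    return cleaned


def structural_reformulation(record: Dict[str, Any]) -> str:
    """Hypothetical helper: present the engineered record as one
    'key: value' pair per line (one of many possible layouts)."""
    return "\n".join(f"{key}: {value}" for key, value in record.items())


if __name__ == "__main__":
    raw = {"user_id": 17, "age": 70, "city": None, "session_token": "abc"}
    engineered = content_engineering(
        raw, defaults={"city": "unknown"}, irrelevant={"session_token"}
    )
    # The reformulated text would be placed in the input-data part of a prompt.
    print(structural_reformulation(engineered))
```

In this sketch the rules are fixed by hand; the title's "automatic" suggests the paper finds such transformations without manual specification, so the code should be read only as a concrete picture of what each stage changes in the input data.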