AIR: Complex Instruction Generation via Automatic Iterative Refinement
Wei Liu, Yancheng He, Yu Li, Hui Huang, Chengwei Hu, Jiaheng Liu, Shilong Li, Wenbo Su, Bo Zheng
Abstract
With the development of large language models, their ability to follow simple instructions has significantly improved. However, adhering to complex instructions remains a major challenge. Current approaches to generating complex instructions are often irrelevant to the current instruction requirements or suffer from limited scalability and diversity. Moreover, methods such as back-translation, while effective for simple instruction generation, fail to leverage the rich knowledge and formatting in human written documents. In this paper, we propose a novel **A**utomatic **I**terative **R**efinement (**AIR**) framework to generate complex instructions with constraints, which not only better reflects the requirements of real scenarios but also significantly enhances LLMs’ ability to follow complex instructions. The AIR framework consists of two stages: 1) Generate an initial instruction from a document; 2) Iteratively refine instructions with LLM-as-judge guidance by comparing the model’s output with the document to incorporate valuable constraints. Finally, we construct the AIR-10K dataset with 10K complex instructions and demonstrate that instructions generated with our approach significantly improve the model’s ability to follow complex instructions, outperforming existing methods for instruction generation.- Anthology ID:
- 2025.emnlp-main.1628
- Volume:
- Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
- Month:
- November
- Year:
- 2025
- Address:
- Suzhou, China
- Editors:
- Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
- Venue:
- EMNLP
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 31952–31974
- Language:
- URL:
- https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1628/
- DOI:
- Cite (ACL):
- Wei Liu, Yancheng He, Yu Li, Hui Huang, Chengwei Hu, Jiaheng Liu, Shilong Li, Wenbo Su, and Bo Zheng. 2025. AIR: Complex Instruction Generation via Automatic Iterative Refinement. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 31952–31974, Suzhou, China. Association for Computational Linguistics.
- Cite (Informal):
- AIR: Complex Instruction Generation via Automatic Iterative Refinement (Liu et al., EMNLP 2025)
- PDF:
- https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1628.pdf