AIR: Complex Instruction Generation via Automatic Iterative Refinement

Wei Liu, Yancheng He, Yu Li, Hui Huang, Chengwei Hu, Jiaheng Liu, Shilong Li, Wenbo Su, Bo Zheng


Abstract
With the development of large language models, their ability to follow simple instructions has significantly improved. However, adhering to complex instructions remains a major challenge. Current approaches to generating complex instructions often produce instructions that do not reflect real instruction requirements, or suffer from limited scalability and diversity. Moreover, methods such as back-translation, while effective for simple instruction generation, fail to leverage the rich knowledge and formatting in human-written documents. In this paper, we propose a novel **A**utomatic **I**terative **R**efinement (**AIR**) framework to generate complex instructions with constraints, which not only better reflects the requirements of real scenarios but also significantly enhances LLMs’ ability to follow complex instructions. The AIR framework consists of two stages: 1) generate an initial instruction from a document; 2) iteratively refine the instruction with LLM-as-judge guidance, comparing the model’s output with the document to incorporate valuable constraints. Finally, we construct the AIR-10K dataset with 10K complex instructions and demonstrate that instructions generated with our approach significantly improve the model’s ability to follow complex instructions, outperforming existing methods for instruction generation.
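The two stages described in the abstract can be pictured as a small loop. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the function `air_refine`, its parameters, the stopping rule, and the toy stand-ins for the generator, responder, and judge are all hypothetical and exist only so the example runs without a real LLM.

```python
from typing import Callable, List


def air_refine(
    document: str,
    generate_instruction: Callable[[str], str],
    respond: Callable[[str], str],
    judge: Callable[[str, str, str], List[str]],
    max_rounds: int = 3,
) -> str:
    """Hypothetical sketch of the two-stage loop (all names are assumptions)."""
    # Stage 1: derive an initial instruction from the document.
    instruction = generate_instruction(document)
    # Stage 2: refine the instruction using judge feedback.
    for _ in range(max_rounds):
        output = respond(instruction)
        # The judge compares the model output with the source document and
        # proposes constraints the instruction is still missing.
        constraints = judge(instruction, output, document)
        if not constraints:
            break  # assumed stopping rule: nothing left to add
        instruction = instruction + " " + " ".join(constraints)
    return instruction


# Toy stand-ins for LLM calls, chosen only to make the sketch deterministic.
def toy_generate(document: str) -> str:
    return "Write a short note about tea."


def toy_respond(instruction: str) -> str:
    return "- Tea is nice." if "bullet" in instruction else "Tea is nice."


def toy_judge(instruction: str, output: str, document: str) -> List[str]:
    # Flag a formatting mismatch between the document and the output.
    if document.startswith("- ") and not output.startswith("- "):
        return ["Format the answer as a bullet list."]
    return []


result = air_refine(
    "- Green tea is steeped at 80 C.", toy_generate, toy_respond, toy_judge
)
print(result)
```

In this toy run the judge notices that the document is a bullet list while the first output is not, so one formatting constraint is appended and the loop stops on the next round. In practice each callable would wrap a model call, and how constraints are merged into the instruction is a design choice this sketch does not settle.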
Anthology ID:
2025.emnlp-main.1628
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
31952–31974
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1628/
Cite (ACL):
Wei Liu, Yancheng He, Yu Li, Hui Huang, Chengwei Hu, Jiaheng Liu, Shilong Li, Wenbo Su, and Bo Zheng. 2025. AIR: Complex Instruction Generation via Automatic Iterative Refinement. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 31952–31974, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
AIR: Complex Instruction Generation via Automatic Iterative Refinement (Liu et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1628.pdf
Checklist:
 2025.emnlp-main.1628.checklist.pdf