Tag-Instruct: Controlled Instruction Complexity Enhancement through Structure-based Augmentation

He Zhu, Zhiwen Ruan, Junyou Su, Xingwei He, Yun Chen, Wenjia Zhang, Guanhua Chen


Abstract
High-quality instruction data is crucial for developing large language models (LLMs), yet existing approaches struggle to effectively control instruction complexity. We present Tag-Instruct, a novel framework that enhances instruction complexity through structured semantic compression and controlled difficulty augmentation. Unlike previous prompt-based methods operating on raw text, Tag-Instruct compresses instructions into a compact tag space and systematically enhances complexity through RL-guided tag expansion. Through extensive experiments, we show that Tag-Instruct outperforms existing instruction complexity augmentation approaches. Our analysis reveals that operating in tag space provides superior controllability and stability across different instruction synthesis frameworks.
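The abstract describes a three-stage pipeline: compress a seed instruction into a compact set of semantic tags, expand that tag set toward higher difficulty (guided by RL in the paper), and then regenerate a richer instruction from the enriched tags. The following is a minimal Python sketch of that loop under stated assumptions; every helper name and prompt here is hypothetical, and the paper's actual tag taxonomy, prompts, and RL reward are not reproduced.

# Minimal sketch of the tag-space augmentation loop described in the abstract.
# All helpers (query_llm, compress_to_tags, ...) are hypothetical placeholders.
from typing import List

def query_llm(prompt: str) -> str:
    """Placeholder for a call to an instruction-following LLM (plug in your own client)."""
    raise NotImplementedError("Replace with an actual LLM call.")

def compress_to_tags(instruction: str) -> List[str]:
    """Stage 1: compress a raw instruction into a compact set of semantic tags."""
    response = query_llm(
        "Summarize the following instruction as a short list of semantic tags, "
        f"one per line:\n{instruction}"
    )
    return [t.strip() for t in response.splitlines() if t.strip()]

def expand_tags(tags: List[str], n_new: int = 2) -> List[str]:
    """Stage 2: propose additional tags that raise task difficulty.

    In Tag-Instruct this expansion is guided by an RL-trained policy with a
    difficulty/quality reward; this sketch simply asks the LLM for candidates.
    """
    response = query_llm(
        f"Given the tags {tags}, propose {n_new} new tags that would make the "
        "underlying task more complex, one per line."
    )
    candidates = [t.strip() for t in response.splitlines() if t.strip()]
    return tags + candidates[:n_new]

def decompress_to_instruction(tags: List[str]) -> str:
    """Stage 3: rewrite the enriched tag set back into a natural-language instruction."""
    return query_llm(
        "Write a single, coherent instruction that covers all of these tags: "
        + ", ".join(tags)
    )

def augment(instruction: str) -> str:
    """Full compress -> expand -> decompress loop for one seed instruction."""
    tags = compress_to_tags(instruction)
    enriched = expand_tags(tags)
    return decompress_to_instruction(enriched)

The sketch only mirrors the control flow implied by the abstract; the paper's contribution lies in how the expansion step is scored and selected, which a prompt-only placeholder like this does not capture.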
Anthology ID:
2025.findings-acl.911
Volume:
Findings of the Association for Computational Linguistics: ACL 2025
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
17708–17729
URL:
https://preview.aclanthology.org/display_plenaries/2025.findings-acl.911/
Cite (ACL):
He Zhu, Zhiwen Ruan, Junyou Su, Xingwei He, Yun Chen, Wenjia Zhang, and Guanhua Chen. 2025. Tag-Instruct: Controlled Instruction Complexity Enhancement through Structure-based Augmentation. In Findings of the Association for Computational Linguistics: ACL 2025, pages 17708–17729, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Tag-Instruct: Controlled Instruction Complexity Enhancement through Structure-based Augmentation (Zhu et al., Findings 2025)
PDF:
https://preview.aclanthology.org/display_plenaries/2025.findings-acl.911.pdf