Find the Intention of Instruction: Comprehensive Evaluation of Instruction Understanding for Large Language Models
Hyeonseok Moon, Jaehyung Seo, Seungyoon Lee, Chanjun Park, Heuiseok Lim
Abstract
Through numerous endeavors, large language models (LLMs) have made significant advances in their instruction-following capability. However, we observe that LLMs are prone to respond to instruction-formatted statements instinctively, rather than comprehending the underlying user intention residing within the given instructions. We also recognize that the significance of instruction understanding capability is largely overlooked in most LLM evaluation benchmarks. To enable a more comprehensive evaluation of the instruction understanding capability of LLMs, we propose the Intention of Instruction (IntInst) benchmark, whose primary objective is to identify the instruction that appropriately directs the generation of a given context. IntInst presents four instruction candidates and requires LLMs to select one among them. Through extensive experiments with several instruction-tuned LLMs, we reveal that most LLMs struggle to grasp the actual intention concealed in the instruction, and we thoroughly analyze the factors influencing instruction understanding.
- Anthology ID:
- 2025.findings-naacl.330
- Volume:
- Findings of the Association for Computational Linguistics: NAACL 2025
- Month:
- April
- Year:
- 2025
- Address:
- Albuquerque, New Mexico
- Editors:
- Luis Chiruzzo, Alan Ritter, Lu Wang
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 5944–5964
- URL:
- https://preview.aclanthology.org/fix-sig-urls/2025.findings-naacl.330/
- Cite (ACL):
- Hyeonseok Moon, Jaehyung Seo, Seungyoon Lee, Chanjun Park, and Heuiseok Lim. 2025. Find the Intention of Instruction: Comprehensive Evaluation of Instruction Understanding for Large Language Models. In Findings of the Association for Computational Linguistics: NAACL 2025, pages 5944–5964, Albuquerque, New Mexico. Association for Computational Linguistics.
- Cite (Informal):
- Find the Intention of Instruction: Comprehensive Evaluation of Instruction Understanding for Large Language Models (Moon et al., Findings 2025)
- PDF:
- https://preview.aclanthology.org/fix-sig-urls/2025.findings-naacl.330.pdf