Call for Rigor in Reporting Quality of Instruction Tuning Data

Hyeonseok Moon, Jaehyung Seo, Heuiseok Lim


Abstract
Instruction tuning is crucial for adapting large language models (LLMs) to align with user intentions. Numerous studies emphasize the significance of the quality of instruction tuning (IT) data, revealing a strong correlation between IT data quality and the alignment performance of LLMs. In these studies, the quality of IT data is typically assessed by evaluating the performance of LLMs trained with that data. However, we identified a prevalent issue in such practice: hyperparameters for training models are often selected arbitrarily without adequate justification. We observed significant variations in hyperparameters applied across different studies, even when training the same model with the same data. In this study, we demonstrate the potential problems arising from this practice and emphasize the need for careful consideration in verifying data quality. Through our experiments on the quality of LIMA data and a selected set of 1,000 Alpaca data points, we demonstrate that arbitrary hyperparameter decisions can lead to virtually any desired conclusion.
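The abstract's central point, that different studies report different training hyperparameters for the same model and the same IT data, can be made concrete with a small sketch. The configuration values, study names, and the report_divergence helper below are hypothetical placeholders, not values taken from this paper or from any cited study; the sketch only illustrates how one might tabulate and compare the training settings reported alongside claims about IT-data quality.

# Hypothetical training configurations reported by three studies that all
# fine-tune the same base model on the same 1,000-example IT dataset.
# Values are illustrative only; they are not taken from the paper.
configs = {
    "study_A": {"learning_rate": 2e-5, "epochs": 3,  "batch_size": 128, "warmup_ratio": 0.03},
    "study_B": {"learning_rate": 1e-5, "epochs": 15, "batch_size": 64,  "warmup_ratio": 0.0},
    "study_C": {"learning_rate": 2e-5, "epochs": 5,  "batch_size": 8,   "warmup_ratio": 0.1},
}

def report_divergence(configs):
    """Print every hyperparameter whose value differs across studies."""
    keys = next(iter(configs.values())).keys()
    for key in keys:
        values = {name: cfg[key] for name, cfg in configs.items()}
        if len(set(values.values())) > 1:
            print(f"{key}: {values}")

report_divergence(configs)
# Any downstream claim of the form "dataset X is higher quality than dataset Y"
# should be read together with such a table, since the paper argues that
# arbitrary choices of these settings can change which dataset appears better.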
Anthology ID:
2025.acl-short.9
Volume:
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
100–109
URL:
https://preview.aclanthology.org/corrections-2025-08/2025.acl-short.9/
DOI:
10.18653/v1/2025.acl-short.9
Cite (ACL):
Hyeonseok Moon, Jaehyung Seo, and Heuiseok Lim. 2025. Call for Rigor in Reporting Quality of Instruction Tuning Data. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 100–109, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Call for Rigor in Reporting Quality of Instruction Tuning Data (Moon et al., ACL 2025)
PDF:
https://preview.aclanthology.org/corrections-2025-08/2025.acl-short.9.pdf