@inproceedings{zhang-etal-2024-recost,
title = "{RECOST}: External Knowledge Guided Data-efficient Instruction Tuning",
author = "Zhang, Qi and
Zhang, Yiming and
Wang, Haobo and
Zhao, Junbo",
editor = "Ku, Lun-Wei and
Martins, Andre and
Srikumar, Vivek",
booktitle = "Findings of the Association for Computational Linguistics: ACL 2024",
month = aug,
year = "2024",
address = "Bangkok, Thailand",
publisher = "Association for Computational Linguistics",
url = "https://preview.aclanthology.org/fix-sig-urls/2024.findings-acl.648/",
doi = "10.18653/v1/2024.findings-acl.648",
pages = "10911--10921",
abstract = "In the current landscape of large language models (LLMs), the process of instruction tuning serves as an essential step. Considering the high computing power overhead, data-efficient instruction tuning was proposed to reduce the training data size in this process, aiming at selecting high-quality instructional data. Nevertheless, we argue that most current data-efficient instruction-tuning methods are highly dependent on the quality of the original instruction-tuning dataset. When it comes to datasets synthesized by LLMs, a common scenario in this field, dirty samples will even be selected with a higher probability than other samples. To address these challenges, we utilized external knowledge (relevant examples or paragraphs) to evaluate those samples synthesized by LLMs with an in-context-based relative predictive entropy. Based on the new metric, we proposed a framework, dubbed as \textbf{RECOST}, which integrates external-knowledge-base re-ranking and diversity-consistent sampling into a single pipeline. Through extensive experiments on several synthetic datasets (Alpaca and Alpaca-gpt4), we demonstrate the effectiveness of our method and achieve even better results with only \textbf{1{\%}} of the full dataset."
}
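
The abstract describes the selection signal only at a high level. As a rough, hypothetical illustration (not the authors' implementation), an "in-context-based relative predictive entropy" can be read as: compare the model's mean token-level predictive entropy on a candidate response with and without a retrieved external-knowledge passage prepended to the instruction. The model choice, prompt layout, and all function names below are assumptions made for the sketch.

# Hypothetical sketch only -- not the RECOST code. Assumes a HuggingFace causal LM.
# Scores a synthesized (instruction, response) pair by how much a retrieved
# external-knowledge passage lowers predictive entropy over the response tokens.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model, an assumption
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def mean_response_entropy(prompt: str, response: str) -> float:
    """Mean token-level predictive entropy over the response, conditioned on the prompt."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + response, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits  # (1, seq_len, vocab_size)
    # The distribution over token t is predicted at position t-1, so the response
    # tokens are covered by logits[prompt_len - 1 : seq_len - 1].
    resp_logits = logits[0, prompt_ids.shape[1] - 1 : -1]
    probs = torch.softmax(resp_logits, dim=-1)
    token_entropy = -(probs * torch.log(probs + 1e-12)).sum(dim=-1)
    return token_entropy.mean().item()

def relative_entropy_score(instruction: str, response: str, knowledge: str) -> float:
    """Positive score = the external passage makes the response more predictable."""
    without_knowledge = mean_response_entropy(instruction + "\n", response)
    with_knowledge = mean_response_entropy(knowledge + "\n" + instruction + "\n", response)
    return without_knowledge - with_knowledge

Under this reading, samples whose responses become markedly easier to predict once grounding text is supplied would be ranked higher before the diversity-consistent sampling step; this is an interpretation of the abstract, not a description of the released pipeline.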