Self-Tuning: Instructing LLMs to Effectively Acquire New Knowledge through Self-Teaching

Xiaoying Zhang, Baolin Peng, Ye Tian, Jingyan Zhou, Yipeng Zhang, Haitao Mi, Helen M. Meng


Abstract
Large language models (LLMs) often struggle to provide up-to-date information due to their one-time training and the constantly evolving nature of the world. To keep LLMs current, existing approaches typically involve continued pre-training on new documents. However, models trained this way frequently have difficulty extracting the newly stored knowledge. Motivated by the remarkable success of the Feynman Technique in efficient human learning, we introduce Self-Tuning, a learning framework aimed at improving an LLM’s ability to effectively acquire new knowledge from unseen raw documents through self-teaching. Specifically, we develop a Self-Teaching strategy that augments the documents with a set of knowledge-intensive tasks created in a self-supervised manner, focusing on three crucial aspects: memorization, comprehension, and self-reflection. Additionally, we introduce three Wiki-Newpages-2023-QA datasets to facilitate an in-depth analysis of an LLM’s knowledge acquisition ability concerning memorization, extraction, and reasoning. Extensive experimental results on various models, e.g., Llama2-7B, reveal that Self-Tuning consistently exhibits superior performance across all knowledge acquisition tasks and excels in preserving previous knowledge.
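To make the abstract's core idea concrete, below is a minimal sketch of how raw documents could be augmented with self-supervised, knowledge-intensive training examples along the three aspects named above (memorization, comprehension, self-reflection). The task prompts, the `generate_fn` callable, and the example format are illustrative assumptions for exposition only, not the authors' released implementation.

```python
# Illustrative sketch (not the paper's code): turn a raw document into
# self-supervised training examples covering memorization, comprehension,
# and self-reflection, as described in the abstract.

from typing import Callable, Dict, List


def build_self_teaching_examples(
    document: str,
    generate_fn: Callable[[str], str],  # hypothetical text-generation callable
) -> List[Dict[str, str]]:
    examples: List[Dict[str, str]] = []

    # 1) Memorization: continue the document from its opening half,
    #    encouraging the model to internalize the surface content.
    midpoint = len(document) // 2
    examples.append({"prompt": document[:midpoint], "target": document[midpoint:]})

    # 2) Comprehension: let the model create question-answer pairs about
    #    the document in a self-supervised fashion.
    qa_text = generate_fn(
        "Write three question-answer pairs that test understanding of the "
        "following passage:\n\n" + document
    )
    examples.append({"prompt": "Answer questions about the passage.", "target": qa_text})

    # 3) Self-reflection: have the model summarize what it learned and note
    #    anything it is still uncertain about.
    reflection = generate_fn(
        "Summarize the key facts in the passage and state any points you are "
        "unsure about:\n\n" + document
    )
    examples.append({"prompt": "Reflect on the passage.", "target": reflection})

    return examples
```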
Anthology ID:
2025.findings-acl.297
Volume:
Findings of the Association for Computational Linguistics: ACL 2025
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
5688–5724
URL:
https://preview.aclanthology.org/display_plenaries/2025.findings-acl.297/
Cite (ACL):
Xiaoying Zhang, Baolin Peng, Ye Tian, Jingyan Zhou, Yipeng Zhang, Haitao Mi, and Helen M. Meng. 2025. Self-Tuning: Instructing LLMs to Effectively Acquire New Knowledge through Self-Teaching. In Findings of the Association for Computational Linguistics: ACL 2025, pages 5688–5724, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Self-Tuning: Instructing LLMs to Effectively Acquire New Knowledge through Self-Teaching (Zhang et al., Findings 2025)
PDF:
https://preview.aclanthology.org/display_plenaries/2025.findings-acl.297.pdf