Acquisition and Application of Novel Knowledge in Large Language Models
Ziyu Shang, Jianghan Liu, Zhizhao Luo, Peng Wang, Wenjun Ke, Jiajun Liu, Zijie Xu, Guozheng Li
Abstract
Recent advancements in large language models (LLMs) have demonstrated their impressive generative capabilities, primarily due to their extensive parameterization, which enables them to encode vast knowledge. However, effectively integrating new knowledge into LLMs remains a major challenge. Current research typically first constructs novel knowledge datasets and then injects this knowledge into LLMs through various techniques. However, existing methods for constructing new datasets either rely on timestamps, which lack rigor, or use simple templates for synthesis, which are simplistic and do not accurately reflect the real world. To address this issue, we propose a novel knowledge dataset construction approach that simulates biological evolution using knowledge graphs to generate synthetic entities with diverse attributes, resulting in a dataset, NovelHuman. Systematic analysis on NovelHuman reveals that the intra-sentence position of knowledge significantly affects the acquisition of knowledge. Therefore, we introduce an intra-sentence permutation to enhance knowledge acquisition. Furthermore, given that potential conflicts exist between autoregressive (AR) training objectives and permutation-based learning, we propose PermAR, a permutation-based language modeling framework for AR models. PermAR seamlessly integrates with mainstream AR architectures, endowing them with bidirectional knowledge acquisition capabilities. Extensive experiments demonstrate the superiority of PermAR, outperforming knowledge augmentation methods by 3.3%-38%.
- Anthology ID: 2025.acl-long.898
- Volume: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month: July
- Year: 2025
- Address: Vienna, Austria
- Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
- Venue: ACL
- Publisher: Association for Computational Linguistics
- Pages: 18348–18368
- URL: https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.898/
- Cite (ACL): Ziyu Shang, Jianghan Liu, Zhizhao Luo, Peng Wang, Wenjun Ke, Jiajun Liu, Zijie Xu, and Guozheng Li. 2025. Acquisition and Application of Novel Knowledge in Large Language Models. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 18348–18368, Vienna, Austria. Association for Computational Linguistics.
- Cite (Informal): Acquisition and Application of Novel Knowledge in Large Language Models (Shang et al., ACL 2025)
- PDF: https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.898.pdf