Dingwei Chen
2025
Expanding before Inferring: Enhancing Factuality in Large Language Models through Premature Layers Interpolation
Dingwei Chen | Ziqiang Liu | Feiteng Fang | Chak Tou Leong | Shiwen Ni | Ahmadreza Argha | Hamid Alinejad-Rokny | Min Yang | Chengming Li
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Large Language Models (LLMs) demonstrate remarkable capabilities in text understanding and generation. However, their tendency to produce factually inconsistent outputs—commonly referred to as “hallucinations”—remains a critical challenge. Existing approaches, such as retrieval-based and inference-time correction methods, primarily address this issue at the input or output level, often overlooking the intrinsic information refinement process and the role of premature layers. Meanwhile, alignment- and fine-tuning-based methods are resource-intensive. In this paper, we propose **PLI** (**P**remature **L**ayers **I**nterpolation), a novel, training-free, and plug-and-play intervention designed to enhance factuality. PLI mitigates hallucinations by inserting premature layers formed through mathematical interpolation with adjacent layers. Inspired by stable diffusion and sampling steps, PLI extends the depth of information processing and transmission in LLMs, improving factual coherence. Experiments on four publicly available datasets demonstrate that PLI effectively reduces hallucinations while outperforming existing baselines in most cases. Further analysis suggests that the success of layer interpolation is closely linked to LLMs’ internal mechanisms. To promote reproducibility, we will release our code and data upon acceptance.
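A minimal sketch of the layer-interpolation idea the abstract describes, assuming a HuggingFace-style decoder whose transformer blocks live in `model.model.layers` (an `nn.ModuleList`). The insertion index `k` and interpolation weight `alpha` are illustrative placeholders, not values from the paper:

```python
# Hypothetical sketch of Premature Layers Interpolation (PLI): build a new
# block whose weights linearly interpolate two adjacent premature layers,
# then insert it between them to deepen information refinement.
import copy
import torch

@torch.no_grad()
def insert_interpolated_layer(model, k: int, alpha: float = 0.5):
    """Insert a block between layers k and k+1 with parameters
    alpha * theta_k + (1 - alpha) * theta_{k+1} (assumed form)."""
    layers = model.model.layers          # nn.ModuleList of decoder blocks
    new_layer = copy.deepcopy(layers[k])  # same architecture as neighbors
    for p_new, p_a, p_b in zip(new_layer.parameters(),
                               layers[k].parameters(),
                               layers[k + 1].parameters()):
        p_new.copy_(alpha * p_a + (1.0 - alpha) * p_b)
    layers.insert(k + 1, new_layer)
    # Caveat: downstream bookkeeping (e.g. config.num_hidden_layers,
    # per-layer KV-cache indices) may also need updating before generation.
    return model
```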
2024
Forgetting before Learning: Utilizing Parametric Arithmetic for Knowledge Updating in Large Language Models
Shiwen Ni | Dingwei Chen | Chengming Li | Xiping Hu | Ruifeng Xu | Min Yang
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Recent advances in Large Language Models (LLMs) have showcased their remarkable capabilities in text understanding and generation. However, even strong LLMs are susceptible to acquiring erroneous or obsolete information from the training corpus, and direct secondary fine-tuning on data containing new knowledge may fail to update it because of conflicts between old and new knowledge. In this paper, we propose a new fine-tuning paradigm called F-Learning (Forgetting before Learning), which employs parametric arithmetic to facilitate the forgetting of old knowledge and the learning of new knowledge. Experimental results on two publicly available datasets demonstrate that F-Learning substantially improves the knowledge-updating performance of both full fine-tuning and LoRA fine-tuning, outperforming existing baselines in most cases. Moreover, we find that forgetting old knowledge by subtracting the LoRA parameters yields an effect similar to subtracting the full fine-tuning parameters, and at times even surpasses it significantly.
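A minimal sketch of the "forgetting" step the abstract describes, assuming plain state-dict arithmetic over a base model and a copy fine-tuned on the old knowledge. The coefficient `lam` and the tensor-level subtraction are assumptions for illustration, not the paper's exact recipe:

```python
# Hypothetical sketch of F-Learning's forgetting step: subtract the
# parameter shift induced by fine-tuning on old knowledge,
#   theta' = theta - lam * (theta_old_ft - theta),
# before fine-tuning theta' on data containing the new knowledge.
import torch

@torch.no_grad()
def forget_old_knowledge(base_sd: dict, old_ft_sd: dict, lam: float = 1.0) -> dict:
    """Apply the subtraction per parameter tensor and return a new state dict."""
    return {name: base_sd[name] - lam * (old_ft_sd[name] - base_sd[name])
            for name in base_sd}

# Usage (assumed workflow): load both state dicts, apply the subtraction,
# load the result into the model, then run the "learning" step by
# fine-tuning (full or LoRA) on the new-knowledge data.
```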