Yuan Meng
2025
One QuantLLM for ALL: Fine-tuning Quantized LLMs Once for Efficient Deployments
Ke Yi | Yuhui Xu | Heng Chang | Yuan Meng | Tong Zhang | Jia Li
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large Language Models (LLMs) have advanced rapidly but face significant memory demands. While quantization has shown promise for LLMs, current methods typically require lengthy training to alleviate the performance degradation from quantization loss. However, deploying LLMs across diverse scenarios with different resource constraints, e.g., servers and personal computers, requires repeated training per application, which amplifies the lengthy training problem. Given this, it is advantageous to train a once-for-all (OFA) supernet capable of yielding diverse optimal subnets for downstream applications through one-shot training. Nonetheless, the scale of current language models impedes efficiency and amplifies interference from weight sharing between subnets. We make an initial attempt to extend the once-for-all framework to large language models. Specifically, we decouple the shared weights to eliminate interference and incorporate Low-Rank adapters for training efficiency. Furthermore, we observe an imbalanced allocation of training resources under traditional uniform sampling. A non-parametric scheduler is introduced to adjust the sampling rate for each quantization configuration, achieving a more balanced allocation among subnets with varying demands. We validate the approach on the LLaMA2 family and Mistral in downstream evaluations, demonstrating high performance while significantly reducing deployment time when facing multiple scenarios.
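The non-parametric scheduler described in the abstract is not specified in detail here; a minimal sketch of one plausible realization is shown below, assuming the scheduler tracks a smoothed loss per quantization configuration and samples configurations in proportion to it, so that lagging subnets receive more training steps. All names (`NonParametricScheduler`, the `"int2"`/`"int4"`/`"int8"` configs, the EMA smoothing) are illustrative assumptions, not the paper's actual implementation.

```python
import random


class NonParametricScheduler:
    """Illustrative sketch: sample quantization configs with probability
    proportional to their smoothed (EMA) loss, so configurations whose
    subnets lag behind are drawn more often during supernet training."""

    def __init__(self, configs, smoothing=0.9):
        self.configs = list(configs)
        self.smoothing = smoothing
        # Start from a uniform loss estimate, i.e. uniform sampling.
        self.ema_loss = {c: 1.0 for c in self.configs}

    def sample(self):
        # Probability proportional to smoothed loss: no learned
        # parameters, hence "non-parametric" in this sketch.
        total = sum(self.ema_loss.values())
        weights = [self.ema_loss[c] / total for c in self.configs]
        return random.choices(self.configs, weights=weights, k=1)[0]

    def update(self, config, loss):
        # Exponential moving average of the observed training loss.
        self.ema_loss[config] = (
            self.smoothing * self.ema_loss[config]
            + (1 - self.smoothing) * loss
        )


sched = NonParametricScheduler(["int2", "int4", "int8"])
sched.update("int2", 5.0)   # the int2 subnet lags behind ...
sched.update("int8", 0.5)   # ... while int8 converges quickly
cfg = sched.sample()        # int2 is now the most likely draw
```

Under this scheme the sampling distribution drifts away from uniform only as far as the observed losses justify, which matches the abstract's goal of balancing resources across subnets with varying demands.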
Parameter-Aware Contrastive Knowledge Editing: Tracing and Rectifying based on Critical Transmission Paths
Songlin Zhai | Yuan Meng | Yuxin Zhang | Guilin Qi
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large language models (LLMs) have encoded vast amounts of knowledge in their parameters, but the acquired knowledge can be incorrect or become outdated over time, necessitating rectification after pre-training. Traditional localized methods in knowledge-based model editing (KME) typically assume that knowledge is stored in particular intermediate layers. However, recent research suggests that these methods do not identify the optimal locations for parameter editing, as knowledge gradually accumulates across all layers in LLMs during the forward pass rather than being stored in specific layers. This paper, for the first time, introduces the concept of critical transmission paths into KME for parameter updating. Specifically, these paths capture the key information flows that significantly influence the model predictions during the editing process. To facilitate this process, we also design a parameter-aware contrastive rectifying algorithm that treats less important paths as contrastive examples. Experiments on two prominent datasets and three widely used LLMs demonstrate the superiority of our method in editing performance.
TEF: Causality-Aware Taxonomy Expansion via Front-Door Criterion
Yuan Meng | Songlin Zhai | Yuxin Zhang | Zhongjian Hu | Guilin Qi
Proceedings of the 31st International Conference on Computational Linguistics
Taxonomy expansion is a primary method for enriching taxonomies, involving appending a large number of additional nodes (i.e., queries) to an existing taxonomy (i.e., seed), with the crucial step being the identification of the appropriate anchor (parent node) for each query by incorporating the structural information of the seed. Despite advancements, existing research still faces an inherent challenge of spurious query-anchor matching, often due to various interference factors (e.g., the consistency of sibling nodes), resulting in biased identifications. To address the bias in taxonomy expansion caused by unobserved factors, we introduce the Structural Causal Model (SCM), known for its bias-elimination capabilities, to prevent these factors from confounding the task through backdoor paths. Specifically, we employ the Front-Door Criterion, which guides the decomposition of the expansion process into a parser module and a connector. This enables the proposed causality-aware Taxonomy Expansion model (TEF) to isolate confounding effects and reveal the true causal relationship between the query and the anchor. Extensive experiments on three benchmarks validate the effectiveness of TEF, with a notable 6.1% accuracy improvement over the state-of-the-art on the SemEval16-Environment dataset.
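For context, the Front-Door Criterion invoked above refers to Pearl's front-door adjustment, which identifies a causal effect through a mediator even when the confounder is unobserved. Mapping the query to $x$, the parser's intermediate representation to $m$, and the anchor prediction to $y$ is an assumption about how the paper's decomposition lines up with the standard formula, which reads:

```latex
P\bigl(y \mid \mathrm{do}(x)\bigr)
  = \sum_{m} P(m \mid x) \sum_{x'} P(y \mid m, x')\, P(x')
```

The outer sum marginalizes over the mediator $m$ (here, plausibly the parser output), while the inner sum re-weights the connector's prediction over all queries $x'$, blocking the backdoor path through the unobserved confounder.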
Co-authors
- Guilin Qi 2
- Songlin Zhai 2
- Yuxin Zhang 2
- Heng Chang 1
- Zhongjian Hu 1