Yiwen Wu
2025
SLIM: Let LLM Learn More and Forget Less with Soft LoRA and Identity Mixture
Jiayi Han
|
Liang Du
|
Hongwei Du
|
Xiangguo Zhou
|
Yiwen Wu
|
Yuanfang Zhang
|
Weibo Zheng
|
Donghong Han
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Despite the recent efforts from the NLP community, balancing the training budget, downstream performance, and general capabilities of large language models (LLM) remains a challenge in many applications. Training the entire model for downstream tasks is expensive, and could easily result in catastrophic forgetting. Parameter-efficient fine-tuning (PEFT) could reduce the training cost, but it still suffers from forgetting, and limits the learning on the downstream tasks. To address the aforementioned issues, we propose a novel mixture of expert (MoE) framework based on Soft LoRA and Identity Mixture (SLIM). SLIM allows dynamic routing between LoRA adapters and identity layers, thus enabling the bypass of LoRA adapters to suppress forgetting of general capacity. We adopt weight yielding with sliding clustering for better out-of-domain distinguish to enhance the routing. We also convert the mixture of LoRA adapters to the model merging formulation and introduce dynamic merging with its fast implementation for LoRA adapters to keep the general capabilities. Extensive experiments demonstrate that the proposed SLIM is comparable to the state-of-the-art PEFT approaches on the downstream tasks while achieving the leading performance in mitigating catastrophic forgetting. We plan to open-source the code upon publication.
2024
How Grammatical Features Impact Machine Translation: A New Test Suite for Chinese-English MT Evaluation
Huacheng Song
|
Yi Li
|
Yiwen Wu
|
Yu Liu
|
Jingxia Lin
|
Hongzhi Xu
Proceedings of the Ninth Conference on Machine Translation
Machine translation (MT) evaluation has evolved toward a trend of fine-grained granularity, enabling a more precise diagnosis of hidden flaws and weaknesses of MT systems from various perspectives. This paper examines how MT systems are potentially affected by certain grammatical features, offering insights into the challenges these features pose and suggesting possible directions for improvement. We develop a new test suite by extracting 7,848 sentences from a multi-domain Chinese-English parallel corpus. All the Chinese text was further annotated with 43 grammatical features using a semi-automatic method. This test suite was subsequently used to evaluate eight state-of-the-art MT systems according to six different automatic evaluation metrics. The results reveal intriguing patterns of MT performance associated with different domains and various grammatical features, highlighting the test suite’s effectiveness. The test suite was made publicly available and it will serve as an important benchmark for evaluating and diagnosing Chinese-English MT systems.