Hongliang Li

2026

Group-Merger: A LoRA-based Framework for Multilingual Continual Learning
Weijian Yi | Hongliang Li | Jinan Xu
Proceedings of the 1st Workshop on Multilinguality in the Era of Large Language Models (MeLLM 2026)

Multilingual continual learning (MCL) is crucial for enabling language models to adapt across diverse linguistic environments while retaining knowledge over time. Existing parameter isolation methods allocate language-specific modules but fail to leverage cross-lingual transfer, leading to inefficient parameter growth and poor generalization. Approaches based on model merging suffer from severe performance degradation as the number of language-specific tasks increases, due to interference between linguistic and task-specific knowledge. To address these challenges, we propose Group-Merger, a framework that employs group-wise merging to balance parameter efficiency and continual learning performance. Our framework mitigates catastrophic forgetting across languages while enabling knowledge transfer. Extensive experiments on multilingual evaluation benchmarks demonstrate superior performance compared to existing methods.

Multilingual retrieval-augmented generation (MRAG) requires models to effectively acquire and integrate beneficial external knowledge from multilingual collections. However, most existing studies employ a unified process in which semantically equivalent queries in different languages are processed through a single-turn retrieval and subsequent optimization. Such a “one-size-fits-all” strategy is often suboptimal in multilingual settings, as the models are prone to knowledge bias and conflict when interacting with the search engine. To alleviate these issues, we propose LcRL, a multilingual search-augmented reinforcement learning framework that integrates a language-coupled Group Relative Policy Optimization into the policy and reward models. We adopt language-coupled group sampling in the rollout module to reduce knowledge bias, and add an auxiliary anti-consistency penalty as a regularizer in the reward models to mitigate knowledge conflict. Experimental results demonstrate that LcRL not only achieves competitive performance but is also suitable for various practical scenarios, such as constrained training data and retrieval over collections encompassing a large number of languages. Our code is available at https://anonymous.4open.science/r/LcRL-B4EF.

2025

In recent years, large language models (LLMs) have made significant progress in knowledge-intensive applications. However, when adapting them to specific domains, we may encounter a multi-stage continuous learning scenario, especially in cases where domain knowledge evolves rapidly. This issue severely limits traditional fine-tuning approaches for LLMs. To overcome this limitation, we propose a new learning paradigm designed specifically for multi-stage continuous learning. This paradigm includes a preference-based learning bias to identify potential knowledge conflicts, as well as a self-distillation-based data augmentation strategy to expand and enrich the training corpus, thereby improving the integration of knowledge-compatible information. In the experiments, we show that our proposed method achieves a significant improvement in accuracy after 7 stages of fine-tuning compared to previous methods, while also demonstrating excellent performance in preserving general knowledge. We have released our code and dataset at Multi-Stage-Learning.

The robustness and security of Large Language Models (LLMs) face increasing threats, especially in multilingual settings. A notable vulnerability is “jailbreaking” via translating harmful queries into rare or underrepresented languages, which often bypasses existing safeguards. In this work, we propose Multilingual Collaborative Defense (MCD), a novel learning method that optimizes a continuous soft safety prompt automatically to facilitate multilingual safeguarding of LLMs. MCD organically leverages collaborative signals from multiple languages by rotating each as the training “center,” allowing auxiliary languages to reinforce safety prompt learning and ensuring cross‐lingual consistency. As a result, MCD improves defense performance across all languages, reduces false refusals, and mitigates safety misalignment caused by corpus imbalance. To evaluate MCD, we construct multilingual versions of jailbreak benchmarks such as MaliciousInstruct and AdvBench, including zero-shot languages, to assess language transferability. Experiments show that MCD outperforms prior approaches in multilingual jailbreak defense while exhibiting strong cross-lingual generalization. Our code is available at https://github.com/HLiang-Lee/MCD.
