Nguyen-Phuong Phan


2025

KDA: Knowledge Distillation Adapter for Cross-Lingual Transfer
Ta-Bao Nguyen | Nguyen-Phuong Phan | Tung Le | Huy Tien Nguyen
Proceedings of the 18th International Natural Language Generation Conference

State-of-the-art cross-lingual transfer often relies on massive multilingual models, but their prohibitive size and computational cost limit their practicality for low-resource languages. An alternative is to adapt powerful, task-specialized monolingual models, but this presents challenges in bridging the vocabulary and structural gaps between languages. To address this, we propose KDA, a Knowledge Distillation Adapter framework that efficiently adapts a fine-tuned, high-resource monolingual model to a low-resource target language. KDA utilizes knowledge distillation to transfer the source model’s task-solving capabilities to the target language in a parameter-efficient manner. In addition, we introduce a novel adapter architecture that integrates source-language token embeddings while learning new positional embeddings, directly mitigating cross-lingual representational mismatches. Our empirical results on zero-shot transfer for Vietnamese Sentiment Analysis demonstrate that KDA significantly outperforms existing methods, offering a new, effective, and computationally efficient pathway for cross-lingual transfer.
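To make the two ideas in the abstract concrete, here is a minimal sketch (not the authors' code) of (a) an adapter that reuses a frozen source model's token embeddings while learning new positional embeddings, and (b) the standard knowledge-distillation objective used to transfer the teacher's task behavior. All module names, the bottleneck design, and the hyperparameters are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossLingualAdapter(nn.Module):
    """Hypothetical adapter: frozen source token embeddings + new positions."""

    def __init__(self, source_token_emb: nn.Embedding, max_len: int = 512):
        super().__init__()
        d_model = source_token_emb.embedding_dim
        # Reuse the source model's token embeddings, kept frozen.
        self.token_emb = source_token_emb
        self.token_emb.weight.requires_grad = False
        # Learn fresh positional embeddings for the target language.
        self.pos_emb = nn.Embedding(max_len, d_model)
        # Small residual bottleneck, as in typical adapter layers.
        self.down = nn.Linear(d_model, d_model // 4)
        self.up = nn.Linear(d_model // 4, d_model)

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        positions = torch.arange(input_ids.size(1), device=input_ids.device)
        h = self.token_emb(input_ids) + self.pos_emb(positions)
        return h + self.up(F.gelu(self.down(h)))

def distillation_loss(student_logits, teacher_logits, temperature: float = 2.0):
    # Standard KD objective (Hinton et al., 2015): KL divergence between
    # temperature-softened teacher and student output distributions.
    t = temperature
    return F.kl_div(
        F.log_softmax(student_logits / t, dim=-1),
        F.softmax(teacher_logits / t, dim=-1),
        reduction="batchmean",
    ) * (t * t)
```

In a zero-shot setup like the one described, only the adapter's new parameters would be trained against the frozen teacher's soft predictions, which is what makes the transfer parameter-efficient.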