Monolingual Adapter Networks for Efficient Cross-Lingual Alignment

Pulkit Arya


Abstract
Multilingual alignment for low-resource languages is a challenge for embedding models. The scarcity of parallel datasets, together with the rich morphological diversity of these languages, adds to the complexity of training multilingual embedding models. To aid the development of multilingual models for under-represented languages such as Sanskrit, we introduce GitaDB: a collection of 640 Sanskrit verses translated into five Indic languages and English. We benchmarked various state-of-the-art embedding models on our dataset in bilingual and cross-lingual semantic retrieval tasks of increasing complexity and found a steep degradation in retrieval scores, with a wide margin between English and Sanskrit targets. To bridge this gap, we introduce Monolingual Adapter Networks: a parameter-efficient method to bolster the cross-lingual alignment of embedding models without the need for parallel corpora or full finetuning.
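The abstract describes adapters applied on top of a frozen embedding model. As a rough illustration only (the paper's actual architecture, training objective, and hyperparameters are not given on this page), a per-language adapter is often realized as a small residual bottleneck on the frozen model's output vector; the class name, dimensions, and zero-initialization below are all assumptions for the sketch:

```python
import numpy as np

class MonolingualAdapter:
    """Hypothetical residual bottleneck adapter over a frozen
    embedding model's output vector. Only the two small projection
    matrices are trained, which is what makes the approach
    parameter-efficient compared to full finetuning."""

    def __init__(self, dim: int, bottleneck: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        # Down-projection is randomly initialized; up-projection starts
        # at zero so the adapter is initially an identity mapping.
        self.W_down = rng.normal(0.0, 0.02, size=(dim, bottleneck))
        self.W_up = np.zeros((bottleneck, dim))

    def __call__(self, x: np.ndarray) -> np.ndarray:
        # Residual connection: with zero W_up, the frozen model's
        # embedding passes through unchanged.
        return x + np.tanh(x @ self.W_down) @ self.W_up

# Stand-in for embeddings from a frozen multilingual model.
dim, bottleneck = 16, 4
adapter = MonolingualAdapter(dim, bottleneck)
x = np.ones((2, dim))
y = adapter(x)
print(y.shape)            # (2, 16)
print(np.allclose(x, y))  # True: identity at initialization
```

Under this sketch, one adapter would be trained per language while the base model stays frozen, so each language adds only `2 * dim * bottleneck` trainable parameters.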
Anthology ID:
2025.mrl-main.24
Volume:
Proceedings of the 5th Workshop on Multilingual Representation Learning (MRL 2025)
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
David Ifeoluwa Adelani, Catherine Arnett, Duygu Ataman, Tyler A. Chang, Hila Gonen, Rahul Raja, Fabian Schmidt, David Stap, Jiayi Wang
Venues:
MRL | WS
Publisher:
Association for Computational Linguistics
Pages:
360–368
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.mrl-main.24/
Cite (ACL):
Pulkit Arya. 2025. Monolingual Adapter Networks for Efficient Cross-Lingual Alignment. In Proceedings of the 5th Workshop on Multilingual Representation Learning (MRL 2025), pages 360–368, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Monolingual Adapter Networks for Efficient Cross-Lingual Alignment (Arya, MRL 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.mrl-main.24.pdf