Model-Based Ranking of Source Languages for Zero-Shot Cross-Lingual Transfer

Abteen Ebrahimi, Adam Wiemerslage, Katharina von der Wense


Abstract
We present NN-Rank, an algorithm for ranking source languages for cross-lingual transfer, which leverages hidden representations from multilingual models and unlabeled target-language data. We experiment with two pretrained multilingual models and two tasks: part-of-speech tagging (POS) and named entity recognition (NER). We consider 51 source languages and evaluate on 56 and 72 target languages for POS and NER, respectively. When using in-domain data, NN-Rank beats state-of-the-art baselines that leverage lexical and linguistic features, with average improvements of up to 35.56 NDCG for POS and 18.14 NDCG for NER. Because prior approaches can fall back on language-level features when target-language data is unavailable, we also show that NN-Rank remains competitive using only the Bible, an out-of-domain corpus available for a large number of languages. Ablations on the amount of unlabeled target data show that, for subsets consisting of as few as 25 examples, NN-Rank produces high-quality rankings that reach 92.8% of the NDCG obtained when ranking with all available target data.
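Since the results above are reported in NDCG (normalized discounted cumulative gain), a minimal sketch of how such a score can be computed for a ranking of source languages may help. This is the common textbook formulation with a linear gain and a log2 position discount, scaled to 0-100; the gain definition used in the paper may differ. The function names, the language codes, and the relevance values below are hypothetical and chosen only for illustration.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of relevance scores given in ranked order."""
    # Position 0 gets discount log2(2) = 1, position 1 gets log2(3), and so on.
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances))


def ndcg(predicted_ranking, true_relevance, k=None):
    """NDCG of a predicted ordering of source languages, on a 0-100 scale.

    predicted_ranking: language codes, best first.
    true_relevance: dict mapping language code -> gain (assumed here to be
        some non-negative measure of actual transfer quality).
    k: optional cutoff; only the top-k positions are scored.
    """
    gains = [true_relevance[lang] for lang in predicted_ranking]
    ideal = sorted(true_relevance.values(), reverse=True)
    if k is not None:
        gains, ideal = gains[:k], ideal[:k]
    ideal_dcg = dcg(ideal)
    return 100.0 * dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical gains for four candidate source languages.
relevance = {"de": 3, "nl": 2, "fi": 1, "ja": 0}
perfect = ndcg(["de", "nl", "fi", "ja"], relevance)
reversed_order = ndcg(["ja", "fi", "nl", "de"], relevance)
```

A ranking that orders the candidates exactly by their true gain scores 100, and any other ordering scores lower, so a difference in NDCG between two ranking methods reflects how much closer one gets to the ideal ordering.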
Anthology ID:
2025.emnlp-main.1650
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
32404–32449
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1650/
Cite (ACL):
Abteen Ebrahimi, Adam Wiemerslage, and Katharina von der Wense. 2025. Model-Based Ranking of Source Languages for Zero-Shot Cross-Lingual Transfer. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 32404–32449, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Model-Based Ranking of Source Languages for Zero-Shot Cross-Lingual Transfer (Ebrahimi et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1650.pdf
Checklist:
 2025.emnlp-main.1650.checklist.pdf