Modality Matching Matters: Calibrating Language Distances for Cross-Lingual Transfer in URIEL+
York Hay Ng, Aditya Khan, Xiang Lu, Matteo Salloum, Michael Zhou, Phuong Hanh Hoang, A. Seza Doğruöz, En-Shiun Annie Lee
Abstract
Existing linguistic knowledge bases such as URIEL+ provide valuable geographic, genetic and typological distances for cross-lingual transfer but suffer from two key limitations. First, their one-size-fits-all vector representations are ill-suited to the diverse structures of linguistic data. Second, they lack a principled method for aggregating these signals into a single, comprehensive score. In this paper, we address these gaps by introducing a framework for type-matched language distances. We propose novel, structure-aware representations for each distance type: speaker-weighted distributions for geography, hyperbolic embeddings for genealogy, and a latent variables model for typology. We unify these signals into a robust, task-agnostic composite distance. Across multiple zero-shot transfer benchmarks, we demonstrate that our representations significantly improve transfer performance when the distance type is relevant to the task, while our composite distance yields gains in most tasks.
- Anthology ID: 2026.eacl-srw.8
- Volume: Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 4: Student Research Workshop)
- Month: March
- Year: 2026
- Address: Rabat, Morocco
- Editors: Selene Baez Santamaria, Sai Ashish Somayajula, Atsuki Yamaguchi
- Venue: EACL
- Publisher: Association for Computational Linguistics
- Pages: 110–130
- URL: https://preview.aclanthology.org/ingest-eacl/2026.eacl-srw.8/
- Cite (ACL): York Hay Ng, Aditya Khan, Xiang Lu, Matteo Salloum, Michael Zhou, Phuong Hanh Hoang, A. Seza Doğruöz, and En-Shiun Annie Lee. 2026. Modality Matching Matters: Calibrating Language Distances for Cross-Lingual Transfer in URIEL+. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 4: Student Research Workshop), pages 110–130, Rabat, Morocco. Association for Computational Linguistics.
- Cite (Informal): Modality Matching Matters: Calibrating Language Distances for Cross-Lingual Transfer in URIEL+ (Ng et al., EACL 2026)
- PDF: https://preview.aclanthology.org/ingest-eacl/2026.eacl-srw.8.pdf