Agyeya Singh Negi
2025
What if I ask in alia lingua? Measuring Functional Similarity Across Languages
Debangan Mishra | Arihant Rastogi | Agyeya Singh Negi | Shashwat Goel | Ponnurangam Kumaraguru
Proceedings of the 5th Workshop on Multilingual Representation Learning (MRL 2025)
How similar are model outputs across languages? In this work, we study this question using a recently proposed model similarity metric, κ_p, applied to 20 languages and 47 subjects in GlobalMMLU. Our analysis reveals that a model's responses become increasingly consistent across languages as its size and capability grow. Interestingly, a model's responses agree more with its own responses in other languages than with the responses of other models prompted in the same language. These results highlight both the value of κ_p as a practical tool for evaluating multilingual reliability and its potential to guide the development of more consistent multilingual systems.
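To make the idea of chance-adjusted agreement across languages concrete, here is a minimal sketch in Python. It is not the similarity metric named in the abstract, whose definition is not given here; as a simplified stand-in it uses Cohen's kappa on discrete multiple-choice answers. The function name `cohen_kappa` and the answer lists are hypothetical and serve only as illustration.

```python
from collections import Counter


def cohen_kappa(answers_a, answers_b):
    """Chance-adjusted agreement between two equal-length answer lists.

    Simplified stand-in (Cohen's kappa), not the paper's metric.
    """
    if len(answers_a) != len(answers_b) or not answers_a:
        raise ValueError("answer lists must be non-empty and of equal length")
    n = len(answers_a)
    # Observed agreement: fraction of questions with identical answers.
    observed = sum(a == b for a, b in zip(answers_a, answers_b)) / n
    # Expected agreement if the two answer sets were independent,
    # estimated from each set's marginal answer frequencies.
    freq_a, freq_b = Counter(answers_a), Counter(answers_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both lists consist of one identical label
    return (observed - expected) / (1.0 - expected)


# Hypothetical answers from one model to the same 8 questions,
# asked once in English and once in Hindi.
english = ["A", "B", "C", "D", "A", "B", "C", "D"]
hindi = ["A", "B", "C", "D", "A", "B", "D", "C"]
print(cohen_kappa(english, hindi))
```

Under these assumptions, the same function could compare one model across two languages or two models within one language, which is the kind of contrast the abstract describes. A value of 1 means identical answers, and a value near 0 means agreement no better than chance.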