Exploring Multilingual Probing in Large Language Models: A Cross-Language Analysis

Daoyang Li, Haiyan Zhao, Qingcheng Zeng, Mengnan Du


Abstract
Probing techniques for large language models (LLMs) have primarily focused on English, overlooking the vast majority of the world’s other languages. In this paper, we extend these probing methods to a multilingual context, investigating how LLMs encode linguistic structures across diverse languages. We conduct experiments on several open-source LLMs, analyzing probing accuracy, trends across layers, and similarities between probing vectors for multiple languages. Our key findings reveal: (1) a consistent performance gap between high-resource and low-resource languages, with high-resource languages achieving significantly higher probing accuracy; (2) divergent layer-wise accuracy trends, where high-resource languages show substantial improvement in deeper layers, similar to English; and (3) higher representational similarities among high-resource languages, with low-resource languages demonstrating lower similarities both among themselves and with high-resource languages. These results provide insights into how linguistic structures are represented differently across languages in LLMs and emphasize the need for improved structure modeling for low-resource languages.
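The two quantities the abstract refers to, probing accuracy and similarity between probing vectors, can be illustrated with a minimal sketch. This is not the authors' code: it assumes the probe is a logistic-regression classifier, it uses synthetic vectors in place of real LLM hidden states, and all names (`make_language_data`, `train_linear_probe`, the three "languages" A, B, C) are hypothetical.

```python
import numpy as np

def make_language_data(rng, direction, n=400):
    """Synthetic stand-in for one layer's hidden states in one language.
    Class 1 is shifted along +direction, class 0 along -direction."""
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, direction.shape[0])) + np.outer(2 * y - 1, direction)
    return X, y

def train_linear_probe(X, y, lr=0.1, steps=200):
    """Fit a logistic-regression probe by gradient descent; returns (w, b)."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(steps):
        logits = np.clip(X @ w + b, -30.0, 30.0)  # clip to avoid overflow in exp
        p = 1.0 / (1.0 + np.exp(-logits))
        w -= lr * (X.T @ (p - y)) / n
        b -= lr * float(np.mean(p - y))
    return w, b

def probe_accuracy(X, y, w, b):
    """Fraction of examples the probe classifies correctly."""
    return float(np.mean(((X @ w + b) > 0) == (y == 1)))

def cosine_similarity(u, v):
    """Cosine similarity between two probing vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

rng = np.random.default_rng(0)
dim = 16
shared = rng.normal(size=dim)                      # direction used by A and B
other = rng.normal(size=dim)
other -= (other @ shared) / (shared @ shared) * shared  # orthogonal to shared

# Languages A and B encode the label along the same direction; C does not.
X_a, y_a = make_language_data(rng, shared)
X_b, y_b = make_language_data(rng, shared)
X_c, y_c = make_language_data(rng, other)

w_a, b_a = train_linear_probe(X_a, y_a)
w_b, _ = train_linear_probe(X_b, y_b)
w_c, _ = train_linear_probe(X_c, y_c)

acc_a = probe_accuracy(X_a, y_a, w_a, b_a)
sim_ab = cosine_similarity(w_a, w_b)
sim_ac = cosine_similarity(w_a, w_c)
print(f"accuracy A: {acc_a:.2f}, sim(A,B): {sim_ab:.2f}, sim(A,C): {sim_ac:.2f}")
```

With real models, `X` would hold the hidden states extracted at a given layer for inputs in one language, and repeating the fit for each layer would yield a layer-wise accuracy curve of the kind the abstract describes.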
Anthology ID:
2025.xllm-1.7
Volume:
Proceedings of the 1st Joint Workshop on Large Language Models and Structure Modeling (XLLM 2025)
Month:
August
Year:
2025
Address:
Vienna, Austria
Editors:
Hao Fei, Kewei Tu, Yuhui Zhang, Xiang Hu, Wenjuan Han, Zixia Jia, Zilong Zheng, Yixin Cao, Meishan Zhang, Wei Lu, N. Siddharth, Lilja Øvrelid, Nianwen Xue, Yue Zhang
Venues:
XLLM | WS
Publisher:
Association for Computational Linguistics
Pages:
61–70
URL:
https://preview.aclanthology.org/landing_page/2025.xllm-1.7/
Cite (ACL):
Daoyang Li, Haiyan Zhao, Qingcheng Zeng, and Mengnan Du. 2025. Exploring Multilingual Probing in Large Language Models: A Cross-Language Analysis. In Proceedings of the 1st Joint Workshop on Large Language Models and Structure Modeling (XLLM 2025), pages 61–70, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Exploring Multilingual Probing in Large Language Models: A Cross-Language Analysis (Li et al., XLLM 2025)
PDF:
https://preview.aclanthology.org/landing_page/2025.xllm-1.7.pdf