Universal Patterns of Grammatical Gender in Multilingual Large Language Models

Andrea Schröter, Ali Basirat


Abstract
Grammatical gender is a fundamental linguistic feature that varies across languages, and its cross-linguistic correspondence has been a central question in disciplines such as cognitive science and linguistic typology. This study takes an information-theoretic approach to investigate the extent to which variational usable information about grammatical gender encoded by a large language model generalizes across languages belonging to different language families. Using mBERT as a case study, we analyze how grammatical gender is encoded and transferred across languages based on the usable information of the intermediate representations. The empirical results provide evidence that gender mechanisms are driven by abstract semantic features largely shared across languages, and that the information becomes more accessible at the higher layers of the language model.
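The "usable information" mentioned in the abstract is commonly estimated as the drop in a probe's uncertainty about the label once it sees the representation: I_V(X → Y) = H_V(Y) − H_V(Y | X). The sketch below shows that arithmetic only and is not the authors' implementation. It assumes that H_V(Y) can be approximated by the empirical label entropy and that a probe has already produced per-example class probabilities. The function names and the toy gender labels are hypothetical.

```python
import math
from collections import Counter


def label_entropy_bits(labels):
    """Empirical entropy of the labels, used here as a stand-in for H_V(Y)."""
    counts = Counter(labels)
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def conditional_entropy_bits(probs, labels):
    """Average negative log2-likelihood of the true label under the probe,
    an estimate of H_V(Y | X). `probs` is a list of {label: probability}."""
    return -sum(math.log2(p[y]) for p, y in zip(probs, labels)) / len(labels)


def usable_information_bits(probs, labels):
    """Estimated usable information: H_V(Y) - H_V(Y | X), in bits."""
    return label_entropy_bits(labels) - conditional_entropy_bits(probs, labels)


# Toy data (hypothetical): four nouns with balanced gender labels.
labels = ["fem", "masc", "fem", "masc"]

# A probe that is always certain and correct recovers the full label entropy.
perfect = [{"fem": 1.0, "masc": 0.0} if y == "fem" else {"fem": 0.0, "masc": 1.0}
           for y in labels]

# A probe that always predicts 50/50 recovers nothing.
uninformed = [{"fem": 0.5, "masc": 0.5} for _ in labels]

info_perfect = usable_information_bits(perfect, labels)
info_uninformed = usable_information_bits(uninformed, labels)
```

In a cross-lingual setting, one would presumably train the probe on representations from one language and evaluate these quantities on another. In practice both entropy terms should be computed on held-out data.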
Anthology ID:
2025.mrl-main.3
Volume:
Proceedings of the 5th Workshop on Multilingual Representation Learning (MRL 2025)
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
David Ifeoluwa Adelani, Catherine Arnett, Duygu Ataman, Tyler A. Chang, Hila Gonen, Rahul Raja, Fabian Schmidt, David Stap, Jiayi Wang
Venues:
MRL | WS
Publisher:
Association for Computational Linguistics
Pages:
34–46
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.mrl-main.3/
Cite (ACL):
Andrea Schröter and Ali Basirat. 2025. Universal Patterns of Grammatical Gender in Multilingual Large Language Models. In Proceedings of the 5th Workshop on Multilingual Representation Learning (MRL 2025), pages 34–46, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Universal Patterns of Grammatical Gender in Multilingual Large Language Models (Schröter & Basirat, MRL 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.mrl-main.3.pdf