Extended Abstract for “Linguistic Universals”: Emergent Shared Features in Independent Monolingual Language Models via Sparse Autoencoders

Ej Zhou, Suchir Salhan


Abstract
Do independently trained monolingual language models converge on shared linguistic principles? To explore this question, we propose to analyze a suite of models, each trained on a single language but with identical architectures and training budgets. We train sparse autoencoders (SAEs) on model activations to obtain interpretable latent features, then align these features across languages using activation correlations. Pairwise analyses test whether the feature spaces show non-trivial convergence, and we identify universal features that consistently emerge across diverse models. Positive results would provide evidence that certain high-level regularities in language are rediscovered independently by machine learning systems.
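The abstract does not specify how features are aligned across languages, so the following is a minimal sketch of one plausible instantiation: given SAE feature activations from two monolingual models on paired (e.g., translated) corpora, compute the cross-model Pearson correlation matrix and find a one-to-one matching that maximizes total correlation. The function name align_sae_features and all parameters are illustrative assumptions, not taken from the paper.

import numpy as np
from scipy.optimize import linear_sum_assignment

def align_sae_features(acts_a, acts_b):
    """Align SAE features of two monolingual models by activation correlation.

    acts_a: (n_tokens, d_a) SAE feature activations of model A on corpus A.
    acts_b: (n_tokens, d_b) SAE feature activations of model B on the
            paired corpus B; rows are assumed to correspond token-by-token.
    Returns (feature_a, feature_b, correlation) triples for matched pairs.
    """
    # Standardize each feature column so that Pearson correlation
    # reduces to a dot product of z-scores divided by n.
    za = (acts_a - acts_a.mean(0)) / (acts_a.std(0) + 1e-8)
    zb = (acts_b - acts_b.mean(0)) / (acts_b.std(0) + 1e-8)
    corr = za.T @ zb / len(acts_a)  # (d_a, d_b) cross-model correlations

    # One-to-one matching maximizing total correlation
    # (Hungarian algorithm on the negated matrix).
    rows, cols = linear_sum_assignment(-corr)
    return list(zip(rows, cols, corr[rows, cols]))

Matched pairs whose correlation exceeds a chosen threshold (say, 0.5) could then be counted as shared features; repeating this over all model pairs yields the kind of pairwise convergence analysis the abstract describes, and features recurring across many pairs would be candidate universals.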
Anthology ID:
2025.mrl-main.9
Volume:
Proceedings of the 5th Workshop on Multilingual Representation Learning (MRL 2025)
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
David Ifeoluwa Adelani, Catherine Arnett, Duygu Ataman, Tyler A. Chang, Hila Gonen, Rahul Raja, Fabian Schmidt, David Stap, Jiayi Wang
Venues:
MRL | WS
Publisher:
Association for Computational Linguistics
Pages:
128–130
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.mrl-main.9/
Cite (ACL):
Ej Zhou and Suchir Salhan. 2025. Extended Abstract for “Linguistic Universals”: Emergent Shared Features in Independent Monolingual Language Models via Sparse Autoencoders. In Proceedings of the 5th Workshop on Multilingual Representation Learning (MRL 2025), pages 128–130, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Extended Abstract for “Linguistic Universals”: Emergent Shared Features in Independent Monolingual Language Models via Sparse Autoencoders (Zhou & Salhan, MRL 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.mrl-main.9.pdf