JaeSeong Kim


2026

We investigate how multilingual representations emerge across depth in large language models. Using a unified probing framework, we analyze six multilingual LLMs across five languages (EN/ES/ZH/FR/DE), decomposing behavior into (i) early-layer dynamics, (ii) linear vs. MLP separability, and (iii) token–language alignment that tracks where vocabulary sharing peaks. Across models, we observe a consistent and substantial early jump: accuracy rises by +73.5 to +80.7 points from L0 to L1 on average, indicating that language-relevant signals become accessible immediately after the embedding layer. Moreover, representations are largely linearly separable: for 5/6 models, the mean gap between MLP and linear probes remains within ±0.5 points. Token–language alignment further reveals systematic structure, with peak vocabulary mass exceeding 48% in some models and substantial variation in the depth of peak sharing. These findings provide a compact, cross-model characterization of how multilingual information is organized across depth and introduce simple alignment metrics that complement accuracy-based evaluation.
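As a rough illustration of the three summary quantities named above (the L0-to-L1 jump, the mean MLP-minus-linear probe gap, and the depth of peak vocabulary sharing), the following minimal Python sketch computes them from per-layer values. The function names and the toy numbers are hypothetical and chosen only for illustration; they are not the paper's code or its reported results, and the exact metric definitions used in the paper may differ.

```python
from statistics import mean


def early_jump(acc_by_layer):
    """Accuracy gain, in points, from layer 0 to layer 1."""
    return acc_by_layer[1] - acc_by_layer[0]


def mean_probe_gap(mlp_acc, linear_acc):
    """Mean per-layer gap (MLP probe minus linear probe), in points."""
    if len(mlp_acc) != len(linear_acc):
        raise ValueError("probe accuracy lists must cover the same layers")
    return mean(m - l for m, l in zip(mlp_acc, linear_acc))


def peak_sharing(mass_by_layer):
    """Return (layer index, value) at which the shared-vocabulary mass peaks."""
    layer = max(range(len(mass_by_layer)), key=mass_by_layer.__getitem__)
    return layer, mass_by_layer[layer]


# Toy per-layer values (made up for illustration, NOT results from the paper).
linear = [20.0, 95.0, 97.0, 98.0]   # linear-probe accuracy per layer, in points
mlp = [20.5, 95.0, 97.5, 98.0]      # MLP-probe accuracy per layer, in points
sharing = [0.10, 0.30, 0.45, 0.25]  # shared-vocabulary mass per layer

print(early_jump(linear))           # jump from L0 to L1
print(mean_probe_gap(mlp, linear))  # a small value suggests linear separability
print(peak_sharing(sharing))        # depth and height of peak sharing
```

Under this reading, a mean gap close to zero means the nonlinear probe recovers little beyond what the linear probe already finds, which is the sense in which the representations are described as largely linearly separable.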