Disentangling Meaning and Language Components in Diverse Multilingual Sentence Embeddings
Kanade Nonomura, Keita Fukushima, Risa Kondo, Tomoyuki Kajiwara
Abstract
We disentangle multilingual sentence embeddings into language-dependent and language-agnostic components, leveraging the latter to improve cross-lingual similarity estimation. Previous studies focused on encoder-based approaches that use only the input sentence; in contrast, this study examines the effectiveness of disentanglement methods across a broader range of sentence embeddings, including decoder-based approaches and those that utilize prompts. Experimental results demonstrate that embedding disentanglement is effective for a wide variety of sentence embeddings.
- Anthology ID:
- 2026.acl-srw.102
- Volume:
- Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, United States
- Editors:
- Santosh T.Y.S.S., Juan Diego Rodriguez, Ona de Gibert
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1169–1176
- URL:
- https://preview.aclanthology.org/ingest-acl/2026.acl-srw.102/
- Cite (ACL):
- Kanade Nonomura, Keita Fukushima, Risa Kondo, and Tomoyuki Kajiwara. 2026. Disentangling Meaning and Language Components in Diverse Multilingual Sentence Embeddings. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026), pages 1169–1176, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal):
- Disentangling Meaning and Language Components in Diverse Multilingual Sentence Embeddings (Nonomura et al., ACL 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl/2026.acl-srw.102.pdf