Abstract
Cross-lingual alignment, the meaningful similarity of representations across languages in multilingual language models, has been an active field of research in recent years. We survey the literature on techniques to improve cross-lingual alignment, providing a taxonomy of methods and summarising insights from throughout the field. We present different understandings of cross-lingual alignment and their limitations. We provide a qualitative summary of results from a number of surveyed papers. Finally, we discuss how these insights may be applied not only to encoder models, where this topic has been heavily studied, but also to encoder-decoder or even decoder-only models, and argue that an effective trade-off between language-neutral and language-specific information is key.
- Anthology ID:
- 2024.findings-acl.649
- Volume:
- Findings of the Association for Computational Linguistics ACL 2024
- Month:
- August
- Year:
- 2024
- Address:
- Bangkok, Thailand and virtual meeting
- Editors:
- Lun-Wei Ku, Andre Martins, Vivek Srikumar
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 10922–10943
- URL:
- https://aclanthology.org/2024.findings-acl.649
- Cite (ACL):
- Katharina Hämmerl, Jindřich Libovický, and Alexander Fraser. 2024. Understanding Cross-Lingual Alignment—A Survey. In Findings of the Association for Computational Linguistics ACL 2024, pages 10922–10943, Bangkok, Thailand and virtual meeting. Association for Computational Linguistics.
- Cite (Informal):
- Understanding Cross-Lingual Alignment—A Survey (Hämmerl et al., Findings 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2024.findings-acl.649.pdf