Abstract
In this tutorial, we provide a comprehensive survey of the exciting recent work on cutting-edge weakly-supervised and unsupervised cross-lingual word representations. After providing a brief history of supervised cross-lingual word representations, we focus on: 1) how to induce weakly-supervised and unsupervised cross-lingual word representations in truly resource-poor settings where bilingual supervision cannot be guaranteed; 2) critical examinations of different training conditions and requirements under which unsupervised algorithms can and cannot work effectively; 3) more robust methods that can mitigate the instability issues and low performance observed for distant language pairs; 4) how to comprehensively evaluate such representations; and 5) diverse applications that benefit from cross-lingual word representations (e.g., MT, dialogue, cross-lingual sequence labeling and structured prediction applications, cross-lingual IR).
- Anthology ID:
- P19-4007
- Volume:
- Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts
- Month:
- July
- Year:
- 2019
- Address:
- Florence, Italy
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 31–38
- URL:
- https://aclanthology.org/P19-4007
- DOI:
- 10.18653/v1/P19-4007
- Cite (ACL):
- Sebastian Ruder, Anders Søgaard, and Ivan Vulić. 2019. Unsupervised Cross-Lingual Representation Learning. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts, pages 31–38, Florence, Italy. Association for Computational Linguistics.
- Cite (Informal):
- Unsupervised Cross-Lingual Representation Learning (Ruder et al., ACL 2019)
- PDF:
- https://preview.aclanthology.org/paclic-22-ingestion/P19-4007.pdf