Scalable Cross-Lingual Transfer of Neural Sentence Embeddings

Hanan Aldarmaki, Mona Diab


Abstract
We develop and investigate several cross-lingual alignment approaches for neural sentence embedding models, such as the supervised inference classifier, InferSent, and sequential encoder-decoder models. We evaluate three alignment frameworks applied to these models: joint modeling, representation transfer learning, and sentence mapping, using parallel text to guide the alignment. Our results support representation transfer as a scalable approach for modular cross-lingual alignment of neural sentence embeddings, where we observe better performance compared to joint models in intrinsic and extrinsic evaluations, particularly with smaller sets of parallel data.
Anthology ID:
S19-1006
Volume:
Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics (*SEM 2019)
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota
Editors:
Rada Mihalcea, Ekaterina Shutova, Lun-Wei Ku, Kilian Evang, Soujanya Poria
Venue:
*SEM
SIGs:
SIGSEM | SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
51–60
URL:
https://aclanthology.org/S19-1006
DOI:
10.18653/v1/S19-1006
Cite (ACL):
Hanan Aldarmaki and Mona Diab. 2019. Scalable Cross-Lingual Transfer of Neural Sentence Embeddings. In Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics (*SEM 2019), pages 51–60, Minneapolis, Minnesota. Association for Computational Linguistics.
Cite (Informal):
Scalable Cross-Lingual Transfer of Neural Sentence Embeddings (Aldarmaki & Diab, *SEM 2019)
PDF:
https://preview.aclanthology.org/naacl-24-ws-corrections/S19-1006.pdf
Data
SNLI