Abstract
We propose novel structural-based approaches for the generation and comparison of cross-lingual sentence representations. We do so by applying geometric and topological methods to analyze the structure of sentences, as captured by their word embeddings. The key properties of our methods are: (a) they are designed to be isometry-invariant, in order to provide language-agnostic representations; (b) they are fully unsupervised and use no cross-lingual signal. The quality of our representations, and their preservation across languages, are evaluated in similarity comparison tasks, achieving competitive results. Furthermore, we show that our structural-based representations can be combined with existing methods for improved results.
- Anthology ID:
- 2022.repl4nlp-1.18
- Volume:
- Proceedings of the 7th Workshop on Representation Learning for NLP
- Month:
- May
- Year:
- 2022
- Address:
- Dublin, Ireland
- Editors:
- Spandana Gella, He He, Bodhisattwa Prasad Majumder, Burcu Can, Eleonora Giunchiglia, Samuel Cahyawijaya, Sewon Min, Maximilian Mozes, Xiang Lorraine Li, Isabelle Augenstein, Anna Rogers, Kyunghyun Cho, Edward Grefenstette, Laura Rimell, Chris Dyer
- Venue:
- RepL4NLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 173–183
- URL:
- https://aclanthology.org/2022.repl4nlp-1.18
- DOI:
- 10.18653/v1/2022.repl4nlp-1.18
- Cite (ACL):
- Shaked Haim Meirom and Omer Bobrowski. 2022. Unsupervised Geometric and Topological Approaches for Cross-Lingual Sentence Representation and Comparison. In Proceedings of the 7th Workshop on Representation Learning for NLP, pages 173–183, Dublin, Ireland. Association for Computational Linguistics.
- Cite (Informal):
- Unsupervised Geometric and Topological Approaches for Cross-Lingual Sentence Representation and Comparison (Haim Meirom & Bobrowski, RepL4NLP 2022)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-1/2022.repl4nlp-1.18.pdf
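
The abstract's property (a), isometry invariance, can be illustrated with a minimal sketch. This is not the authors' method: it only shows one simple isometry-invariant quantity, the sorted list of pairwise distances between a sentence's word vectors, and checks that it is unchanged by a rotation plus a translation. The function names `distance_signature` and `rotate_and_shift`, the 2-D toy vectors, and the chosen angle and shift are all assumptions made for illustration.

```python
import math

def distance_signature(points):
    """Sorted pairwise Euclidean distances between the given vectors."""
    return sorted(
        math.dist(points[i], points[j])
        for i in range(len(points))
        for j in range(i + 1, len(points))
    )

def rotate_and_shift(points, angle, shift):
    """Apply a 2-D rotation followed by a translation (an isometry)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1]) for x, y in points]

# Toy stand-in for a sentence: four 2-D "word vectors" (hypothetical values).
sentence = [(0.0, 0.0), (1.0, 0.5), (-0.3, 2.0), (1.5, -1.0)]

# Hypothetical second embedding space: the same points under an isometry.
moved = rotate_and_shift(sentence, angle=0.7, shift=(3.0, -2.0))

before = distance_signature(sentence)
after = distance_signature(moved)

# The signatures agree up to floating-point error.
invariant = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(before, after))
```

Because the signature depends only on distances between words, two embedding spaces that differ by an isometry would assign the same sentence the same signature, which is the sense in which such a representation could be language-agnostic without any cross-lingual signal.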