Embedding Learning Through Multilingual Concept Induction

Philipp Dufter, Mengjie Zhao, Martin Schmitt, Alexander Fraser, Hinrich Schütze


Abstract
We present a new method for estimating vector space representations of words: embedding learning by concept induction. We test this method on a highly parallel corpus and learn semantic representations of words in 1259 different languages in a single common space. An extensive experimental evaluation on crosslingual word similarity and sentiment analysis indicates that concept-based multilingual embedding learning performs better than previous approaches.
Anthology ID:
P18-1141
Volume:
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2018
Address:
Melbourne, Australia
Editors:
Iryna Gurevych, Yusuke Miyao
Venue:
ACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
1520–1530
Language:
URL:
https://aclanthology.org/P18-1141
DOI:
10.18653/v1/P18-1141
Bibkey:
Cite (ACL):
Philipp Dufter, Mengjie Zhao, Martin Schmitt, Alexander Fraser, and Hinrich Schütze. 2018. Embedding Learning Through Multilingual Concept Induction. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1520–1530, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal):
Embedding Learning Through Multilingual Concept Induction (Dufter et al., ACL 2018)
Copy Citation:
PDF:
https://preview.aclanthology.org/nschneid-patch-3/P18-1141.pdf
Poster:
 P18-1141.Poster.pdf