HCCL at SemEval-2017 Task 2: Combining Multilingual Word Embeddings and Transliteration Model for Semantic Similarity

Junqing He, Long Wu, Xuemin Zhao, Yonghong Yan


Abstract
In this paper, we introduce an approach that combines word embeddings and machine translation for multilingual semantic word similarity, Task 2 of SemEval-2017. Thanks to an unsupervised transliteration model, our cross-lingual word embeddings encounter fewer out-of-vocabulary words (OOVs). Our results are produced using only monolingual Wikipedia corpora and a limited amount of sentence-aligned data. Although relatively few resources are used, our system ranked 3rd in the monolingual subtask and can place 6th in the cross-lingual subtask.
Anthology ID:
S17-2033
Volume:
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)
Month:
August
Year:
2017
Address:
Vancouver, Canada
Editors:
Steven Bethard, Marine Carpuat, Marianna Apidianaki, Saif M. Mohammad, Daniel Cer, David Jurgens
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
220–225
URL:
https://aclanthology.org/S17-2033
DOI:
10.18653/v1/S17-2033
Cite (ACL):
Junqing He, Long Wu, Xuemin Zhao, and Yonghong Yan. 2017. HCCL at SemEval-2017 Task 2: Combining Multilingual Word Embeddings and Transliteration Model for Semantic Similarity. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pages 220–225, Vancouver, Canada. Association for Computational Linguistics.
Cite (Informal):
HCCL at SemEval-2017 Task 2: Combining Multilingual Word Embeddings and Transliteration Model for Semantic Similarity (He et al., SemEval 2017)
PDF:
https://preview.aclanthology.org/nschneid-patch-5/S17-2033.pdf