Abstract
In this paper, we introduce an approach that combines word embeddings and machine translation for multilingual semantic word similarity, Task 2 of SemEval-2017. Thanks to an unsupervised transliteration model, our cross-lingual word embeddings encounter fewer out-of-vocabulary (OOV) words. Our results are produced using only monolingual Wikipedia corpora and a limited amount of sentence-aligned data. Although relatively few resources are used, our system ranked 3rd in the monolingual subtask and would rank 6th in the cross-lingual subtask.
- Anthology ID:
- S17-2033
- Volume:
- Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)
- Month:
- August
- Year:
- 2017
- Address:
- Vancouver, Canada
- Editors:
- Steven Bethard, Marine Carpuat, Marianna Apidianaki, Saif M. Mohammad, Daniel Cer, David Jurgens
- Venue:
- SemEval
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 220–225
- URL:
- https://aclanthology.org/S17-2033
- DOI:
- 10.18653/v1/S17-2033
- Cite (ACL):
- Junqing He, Long Wu, Xuemin Zhao, and Yonghong Yan. 2017. HCCL at SemEval-2017 Task 2: Combining Multilingual Word Embeddings and Transliteration Model for Semantic Similarity. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pages 220–225, Vancouver, Canada. Association for Computational Linguistics.
- Cite (Informal):
- HCCL at SemEval-2017 Task 2: Combining Multilingual Word Embeddings and Transliteration Model for Semantic Similarity (He et al., SemEval 2017)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-5/S17-2033.pdf