Abstract
This paper describes Sew-Embed, our language-independent approach to multilingual and cross-lingual semantic word similarity as part of SemEval-2017 Task 2. We leverage the Wikipedia-based concept representations developed by Raganato et al. (2016), and propose an embedded augmentation of their explicit high-dimensional vectors, which we obtain by plugging in an arbitrary word (or sense) embedding representation, and computing a weighted average in the continuous vector space. We evaluate Sew-Embed with two different off-the-shelf embedding representations, and report their performance across all monolingual and cross-lingual benchmarks available for the task. Despite its simplicity, especially compared with supervised or overly tuned approaches, Sew-Embed achieves competitive results in the cross-lingual setting (3rd best result in the global ranking of subtask 2, score 0.56).

- Anthology ID: S17-2041
- Volume: Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)
- Month: August
- Year: 2017
- Address: Vancouver, Canada
- Editors: Steven Bethard, Marine Carpuat, Marianna Apidianaki, Saif M. Mohammad, Daniel Cer, David Jurgens
- Venue: SemEval
- SIG: SIGLEX
- Publisher: Association for Computational Linguistics
- Pages: 261–266
- URL: https://aclanthology.org/S17-2041
- DOI: 10.18653/v1/S17-2041
- Cite (ACL): Claudio Delli Bovi and Alessandro Raganato. 2017. Sew-Embed at SemEval-2017 Task 2: Language-Independent Concept Representations from a Semantically Enriched Wikipedia. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pages 261–266, Vancouver, Canada. Association for Computational Linguistics.
- Cite (Informal): Sew-Embed at SemEval-2017 Task 2: Language-Independent Concept Representations from a Semantically Enriched Wikipedia (Delli Bovi & Raganato, SemEval 2017)
- PDF: https://preview.aclanthology.org/nschneid-patch-4/S17-2041.pdf
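
The embedded augmentation mentioned in the abstract (a weighted average, in a continuous vector space, over the dimensions of an explicit concept vector) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names `embed_concept` and `cosine`, the toy two-dimensional embeddings, the toy weights, and the choice of cosine as the similarity measure. The weighting scheme and the embeddings actually used by Sew-Embed are described in the paper itself.

```python
import math


def embed_concept(explicit_vector, embeddings):
    """Collapse a sparse explicit vector into a dense one.

    explicit_vector: dict mapping a dimension name to a non-negative weight
                     (hypothetical stand-in for an explicit concept vector).
    embeddings:      dict mapping a dimension name to a dense vector
                     (hypothetical stand-in for an off-the-shelf embedding).
    Returns the weighted average of the embeddings, or None if no dimension
    of the explicit vector has an embedding.
    """
    total_weight = 0.0
    result = None
    for dimension, weight in explicit_vector.items():
        vector = embeddings.get(dimension)
        if vector is None:
            continue  # dimension not covered by the embedding space
        if result is None:
            result = [0.0] * len(vector)
        for i, value in enumerate(vector):
            result[i] += weight * value
        total_weight += weight
    if result is None or total_weight == 0.0:
        return None
    return [value / total_weight for value in result]


def cosine(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy data, invented for this example only.
toy_embeddings = {
    "music": [1.0, 0.0],
    "instrument": [0.0, 1.0],
    "sport": [-1.0, 0.0],
}
guitar = embed_concept({"music": 3.0, "instrument": 1.0}, toy_embeddings)
piano = embed_concept({"music": 1.0, "instrument": 1.0}, toy_embeddings)
football = embed_concept({"sport": 2.0}, toy_embeddings)

print(cosine(guitar, piano), cosine(guitar, football))
```

In this sketch, two concepts whose explicit vectors share highly weighted dimensions end up close in the dense space, and the same procedure applies unchanged to any language for which the explicit vectors and an embedding space are available.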