Automated WordNet Construction Using Word Embeddings
Mikhail Khodak, Andrej Risteski, Christiane Fellbaum, Sanjeev Arora
Abstract
We present a fully unsupervised method for automated construction of WordNets based upon recent advances in distributional representations of sentences and word-senses combined with readily available machine translation tools. The approach requires very few linguistic resources and is thus extensible to multiple target languages. To evaluate our method we construct two 600-word testsets for word-to-synset matching in French and Russian using native speakers and evaluate the performance of our method along with several other recent approaches. Our method exceeds the best language-specific and multi-lingual automated WordNets in F-score for both languages. The databases we construct for French and Russian, both languages without large publicly available manually constructed WordNets, will be publicly released along with the testsets.
- Anthology ID:
- W17-1902
- Volume:
- Proceedings of the 1st Workshop on Sense, Concept and Entity Representations and their Applications
- Month:
- April
- Year:
- 2017
- Address:
- Valencia, Spain
- Editors:
- Jose Camacho-Collados, Mohammad Taher Pilehvar
- Venue:
- SENSE
- Publisher:
- Association for Computational Linguistics
- Pages:
- 12–23
- URL:
- https://aclanthology.org/W17-1902
- DOI:
- 10.18653/v1/W17-1902
- Cite (ACL):
- Mikhail Khodak, Andrej Risteski, Christiane Fellbaum, and Sanjeev Arora. 2017. Automated WordNet Construction Using Word Embeddings. In Proceedings of the 1st Workshop on Sense, Concept and Entity Representations and their Applications, pages 12–23, Valencia, Spain. Association for Computational Linguistics.
- Cite (Informal):
- Automated WordNet Construction Using Word Embeddings (Khodak et al., SENSE 2017)
- PDF:
- https://preview.aclanthology.org/add_acl24_videos/W17-1902.pdf
- Code
- mkhodak/pawn