Abstract
This paper describes a simple but competitive unsupervised system for hypernym discovery. The system uses skip-gram word embeddings with negative sampling, trained on specialised corpora. Candidate hypernyms for an input word are predicted based on cosine similarity scores. Two sets of word embedding models were trained separately on two specialised corpora: a medical corpus and a music industry corpus. Our system scored highest in the medical domain among the competing unsupervised systems but performed poorly on the music industry domain. Our system does not depend on any external data other than raw specialised corpora.
- Anthology ID:
- S18-1151
- Volume:
- Proceedings of the 12th International Workshop on Semantic Evaluation
- Month:
- June
- Year:
- 2018
- Address:
- New Orleans, Louisiana
- Venue:
- SemEval
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 924–927
- URL:
- https://aclanthology.org/S18-1151
- DOI:
- 10.18653/v1/S18-1151
- Cite (ACL):
- Alfredo Maldonado and Filip Klubička. 2018. ADAPT at SemEval-2018 Task 9: Skip-Gram Word Embeddings for Unsupervised Hypernym Discovery in Specialised Corpora. In Proceedings of the 12th International Workshop on Semantic Evaluation, pages 924–927, New Orleans, Louisiana. Association for Computational Linguistics.
- Cite (Informal):
- ADAPT at SemEval-2018 Task 9: Skip-Gram Word Embeddings for Unsupervised Hypernym Discovery in Specialised Corpora (Maldonado & Klubička, SemEval 2018)
- PDF:
- https://preview.aclanthology.org/remove-xml-comments/S18-1151.pdf
- Data
- SemEval-2018 Task 9: Hypernym Discovery
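
The abstract states that candidate hypernyms are predicted from cosine similarity scores over skip-gram embeddings. A minimal sketch of that ranking step is given here, under loud assumptions: the three-dimensional vectors in `TOY_EMBEDDINGS` are invented for illustration and are not taken from the paper's trained models, and the names `cosine`, `rank_candidates` and `top_k` are hypothetical. The abstract does not say how the candidate vocabulary was chosen or filtered, so this sketch simply ranks every other word in the vocabulary.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(query, embeddings, top_k=3):
    """Return up to top_k (word, score) pairs, most similar to `query` first.

    Words missing from the embedding table yield an empty list.
    """
    if query not in embeddings:
        return []
    query_vec = embeddings[query]
    scored = [
        (word, cosine(query_vec, vec))
        for word, vec in embeddings.items()
        if word != query  # never propose the input word itself
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Invented toy vectors, NOT from the paper's trained models (assumption).
TOY_EMBEDDINGS = {
    "aspirin": [0.9, 0.1, 0.0],
    "drug": [0.8, 0.2, 0.1],
    "medication": [0.7, 0.3, 0.0],
    "guitar": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    for word, score in rank_candidates("aspirin", TOY_EMBEDDINGS, top_k=2):
        print(f"{word}\t{score:.3f}")
```

In a real setting the embedding table would come from a skip-gram model with negative sampling trained on the domain corpus. Cosine similarity captures general relatedness rather than the hypernymy relation specifically, so the top-ranked neighbours are only candidates and may include co-hyponyms or other related terms.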