Local-Global Vectors to Improve Unigram Terminology Extraction

Ehsan Amjadian, Diana Inkpen, Tahereh Paribakht, Farahnaz Faez


Abstract
This paper explores a novel method that integrates efficient distributed representations with terminology extraction. We show that information from a small number of observed instances can be combined with local and global word embeddings to markedly improve term extraction results on unigram terms. To do so, we pass the terms extracted by other tools to a filter consisting of the local-global embeddings and a classifier, which in turn decides whether or not each term candidate is a term. The filter can also be used as a hub to merge different term extraction tools into a single higher-performing system. We compare filters that use the skip-gram architecture with filters that employ the CBOW architecture for this task.
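The filtering idea in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the tiny hand-written `LOCAL` and `GLOBAL` tables stand in for embeddings trained on a domain corpus and a general corpus, logistic regression stands in for whatever classifier the paper uses, and the names `local_global_vector`, `train_logistic`, `filter_candidates`, `tool_a_output`, and `tool_b_output` are hypothetical.

```python
import math

# Hypothetical toy embeddings (2 dimensions each) standing in for vectors
# trained on a domain corpus (LOCAL) and on a general corpus (GLOBAL).
LOCAL = {
    "enzyme": [0.90, 0.10], "protein": [0.80, 0.20], "ligand": [0.85, 0.15],
    "table": [0.10, 0.90], "result": [0.20, 0.80], "paper": [0.15, 0.85],
}
GLOBAL = {
    "enzyme": [0.70, 0.30], "protein": [0.60, 0.40], "ligand": [0.65, 0.35],
    "table": [0.30, 0.70], "result": [0.40, 0.60], "paper": [0.35, 0.65],
}

# A small number of labelled instances: 1 = term, 0 = non-term.
TRAIN = [("enzyme", 1), ("table", 0), ("protein", 1), ("result", 0)]


def local_global_vector(word):
    """Concatenate the local and global embeddings of a word."""
    return LOCAL[word] + GLOBAL[word]


def score(weights, bias, features):
    """Linear score; a positive value means 'classified as a term'."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def train_logistic(labelled, epochs=200, lr=0.5):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(local_global_vector(labelled[0][0]))
    bias = 0.0
    for _ in range(epochs):
        for word, label in labelled:
            features = local_global_vector(word)
            prob = 1.0 / (1.0 + math.exp(-score(weights, bias, features)))
            error = prob - label
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


def filter_candidates(candidates, weights, bias):
    """Keep candidates that have both embeddings and are classified as terms."""
    kept = []
    for word in dict.fromkeys(candidates):  # de-duplicate, preserve order
        if word in LOCAL and word in GLOBAL:
            if score(weights, bias, local_global_vector(word)) > 0:
                kept.append(word)
    return kept


weights, bias = train_logistic(TRAIN)

# Using the filter as a hub: pool the output of two hypothetical extraction
# tools, then let the classifier decide which candidates survive.
tool_a_output = ["ligand", "paper"]
tool_b_output = ["paper", "ligand", "foo"]  # "foo" has no embeddings
merged = tool_a_output + tool_b_output
print(filter_candidates(merged, weights, bias))
```

In a realistic setting the two lookup tables would be replaced by skip-gram or CBOW vectors trained on the respective corpora, and candidates without a vector in either space would need an explicit fallback policy; here they are simply dropped.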
Anthology ID:
W16-4702
Volume:
Proceedings of the 5th International Workshop on Computational Terminology (Computerm2016)
Month:
December
Year:
2016
Address:
Osaka, Japan
Venue:
CompuTerm
Publisher:
The COLING 2016 Organizing Committee
Pages:
2–11
URL:
https://aclanthology.org/W16-4702
Cite (ACL):
Ehsan Amjadian, Diana Inkpen, Tahereh Paribakht, and Farahnaz Faez. 2016. Local-Global Vectors to Improve Unigram Terminology Extraction. In Proceedings of the 5th International Workshop on Computational Terminology (Computerm2016), pages 2–11, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
Local-Global Vectors to Improve Unigram Terminology Extraction (Amjadian et al., CompuTerm 2016)
PDF:
https://preview.aclanthology.org/ingestion-script-update/W16-4702.pdf