Word Embedding Evaluation and Combination
Sahar Ghannay, Benoit Favre, Yannick Estève, Nathalie Camelin
Abstract
Word embeddings have been successfully used in several natural language processing (NLP) and speech processing tasks. Different approaches have been introduced to calculate word embeddings through neural networks. In the literature, many studies have focused on word embedding evaluation, but to our knowledge, there are still some gaps. This paper presents a study focusing on a rigorous comparison of the performance of different kinds of word embeddings. Performance is evaluated on different NLP and linguistic tasks, while all the word embeddings are estimated on the same training data using the same vocabulary, the same number of dimensions, and other similar characteristics. The evaluation results reported in this paper match those in the literature, since they point out that the improvements achieved by a word embedding in one task are not consistently observed across all tasks. For that reason, this paper investigates and evaluates approaches to combining word embeddings in order to take advantage of their complementarity, and to look for effective word embeddings that can achieve good performance on all tasks. In conclusion, this paper provides new insights into the intrinsic qualities of the well-known word embedding families, which can differ from those provided by previously published work in the scientific literature.
- Anthology ID: L16-1046
- Volume: Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
- Month: May
- Year: 2016
- Address: Portorož, Slovenia
- Editors: Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue: LREC
- SIG:
- Publisher: European Language Resources Association (ELRA)
- Note:
- Pages: 300–305
- Language:
- URL: https://aclanthology.org/L16-1046
- DOI:
- Cite (ACL): Sahar Ghannay, Benoit Favre, Yannick Estève, and Nathalie Camelin. 2016. Word Embedding Evaluation and Combination. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 300–305, Portorož, Slovenia. European Language Resources Association (ELRA).
- Cite (Informal): Word Embedding Evaluation and Combination (Ghannay et al., LREC 2016)
- PDF: https://preview.aclanthology.org/ingest-bitext-workshop/L16-1046.pdf
- Data: Penn Treebank