Embeddings in Natural Language Processing

Jose Camacho-Collados, Mohammad Taher Pilehvar


Abstract
Embeddings have been one of the most important topics of interest in NLP for the past decade. Representing knowledge through a low-dimensional vector that is easily integrated into modern machine learning models has played a central role in the development of the field. Embedding techniques initially focused on words, but attention soon shifted to other forms. This tutorial will provide a high-level synthesis of the main embedding techniques in NLP, in the broad sense. We will start with conventional word embeddings (e.g., Word2Vec and GloVe) and then move to other types of embeddings, such as sense-specific and graph alternatives. We will conclude with an overview of the trending contextualized representations (e.g., ELMo and BERT) and explain their potential and impact in NLP.
Anthology ID:
2020.coling-tutorials.2
Volume:
Proceedings of the 28th International Conference on Computational Linguistics: Tutorial Abstracts
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Venue:
COLING
Publisher:
International Committee for Computational Linguistics
Pages:
10–15
URL:
https://aclanthology.org/2020.coling-tutorials.2
DOI:
10.18653/v1/2020.coling-tutorials.2
Cite (ACL):
Jose Camacho-Collados and Mohammad Taher Pilehvar. 2020. Embeddings in Natural Language Processing. In Proceedings of the 28th International Conference on Computational Linguistics: Tutorial Abstracts, pages 10–15, Barcelona, Spain (Online). International Committee for Computational Linguistics.
Cite (Informal):
Embeddings in Natural Language Processing (Camacho-Collados & Pilehvar, COLING 2020)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2020.coling-tutorials.2.pdf