Building Static Embeddings from Contextual Ones: Is It Useful for Building Distributional Thesauri?

Olivier Ferret


Abstract
While contextual language models are now dominant in Natural Language Processing, the representations they build at the token level are not always suitable for all uses. In this article, we propose a new method for building word- or type-level embeddings from contextual models. This method combines the generalization and the aggregation of token representations. We evaluate it on a large set of English nouns from the perspective of building distributional thesauri for extracting semantic similarity relations. Moreover, we analyze the differences between static embeddings and type-level embeddings according to features such as word frequency or the type of semantic relations these embeddings account for, showing that the properties of these two kinds of embeddings can be complementary and exploited to further improve distributional thesauri.
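For illustration only, and not the exact method proposed in the paper, the sketch below shows a common baseline for deriving a static, type-level embedding from a contextual model: the contextual vectors of a word's occurrences are collected over a set of sentences and mean-pooled into a single vector. The model name, the pooling strategy, and the helper functions are assumptions made for the example.

import torch
from transformers import AutoModel, AutoTokenizer

# Hypothetical choices for this sketch; the model and pooling strategy
# are not taken from the paper.
MODEL_NAME = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)
model.eval()

def static_embedding(word, sentences):
    """Mean-pool the contextual vectors of `word` over its occurrences
    in `sentences` to obtain a single type-level vector."""
    occurrence_vectors = []
    word_piece_ids = tokenizer(word, add_special_tokens=False)["input_ids"]
    for sentence in sentences:
        enc = tokenizer(sentence, return_tensors="pt", truncation=True)
        with torch.no_grad():
            hidden = model(**enc).last_hidden_state[0]  # (seq_len, dim)
        ids = enc["input_ids"][0].tolist()
        # Locate the word's subword span in the tokenized sentence.
        for i in range(len(ids) - len(word_piece_ids) + 1):
            if ids[i:i + len(word_piece_ids)] == word_piece_ids:
                # Mean-pool the subword vectors of this occurrence.
                occurrence_vectors.append(
                    hidden[i:i + len(word_piece_ids)].mean(dim=0))
                break
    if not occurrence_vectors:
        return None
    # Aggregate all occurrences into one static vector for the word type.
    return torch.stack(occurrence_vectors).mean(dim=0)

def cosine(u, v):
    """Cosine similarity between two aggregated word vectors."""
    return torch.nn.functional.cosine_similarity(u, v, dim=0).item()

In a distributional-thesaurus setting, the entry for a noun would then typically be the list of its nearest neighbors ranked by cosine similarity between such aggregated vectors.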
Anthology ID: 2022.lrec-1.276
Volume: Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month: June
Year: 2022
Address: Marseille, France
Editors: Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue: LREC
Publisher: European Language Resources Association
Pages: 2583–2590
URL: https://aclanthology.org/2022.lrec-1.276
Cite (ACL): Olivier Ferret. 2022. Building Static Embeddings from Contextual Ones: Is It Useful for Building Distributional Thesauri?. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 2583–2590, Marseille, France. European Language Resources Association.
Cite (Informal): Building Static Embeddings from Contextual Ones: Is It Useful for Building Distributional Thesauri? (Ferret, LREC 2022)
PDF: https://preview.aclanthology.org/emnlp-22-attachments/2022.lrec-1.276.pdf