Distilling Word Meaning in Context from Pre-trained Language Models

Yuki Arase, Tomoyuki Kajiwara


Abstract
In this study, we propose a self-supervised learning method that distils representations of word meaning in context from a pre-trained masked language model. Word representations are the basis of context-aware lexical semantics and unsupervised semantic textual similarity (STS) estimation. A previous study transforms contextualised representations using static word embeddings to weaken the excessive effects of contextual information. In contrast, the proposed method derives representations of word meaning in context while keeping useful contextual information intact. Specifically, our method learns to combine the outputs of different hidden layers using self-attention, trained in a self-supervised manner on an automatically generated corpus. To evaluate the proposed approach, we conducted comparative experiments on a range of benchmark tasks. The results confirm that our representations are competitive with those of the state-of-the-art method, which transforms contextualised representations, on context-aware lexical semantic tasks, and outperform it on STS estimation.
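The core mechanism described above, combining the hidden states of different layers for a target word with a learned self-attention over layers, can be sketched roughly as follows. This is a hypothetical illustration assuming a HuggingFace BERT backbone, not the authors' implementation (see yukiar/distil_wic for that); the class name LayerAttentionPooler, the model choice, and the example sentence are assumptions, and the paper's self-supervised training objective is omitted, so the attention weights here are untrained.

import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

class LayerAttentionPooler(nn.Module):
    """Combine per-layer hidden states of one token with a learned
    attention distribution over layers (illustrative sketch only)."""
    def __init__(self, hidden_size: int):
        super().__init__()
        self.query = nn.Linear(hidden_size, 1)  # scores each layer's vector

    def forward(self, layer_states: torch.Tensor) -> torch.Tensor:
        # layer_states: (num_layers, hidden_size) for a single token position
        scores = self.query(layer_states).squeeze(-1)          # (num_layers,)
        weights = torch.softmax(scores, dim=-1)                # attention over layers
        return (weights.unsqueeze(-1) * layer_states).sum(0)   # (hidden_size,)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
pooler = LayerAttentionPooler(model.config.hidden_size)  # would be trained in practice

sentence = "The bank raised its interest rates."
target_word = "bank"

inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# hidden_states: tuple of (num_layers + 1) tensors, each (1, seq_len, hidden_size)
hidden_states = torch.stack(outputs.hidden_states, dim=0)   # (L+1, 1, seq, hidden)

# locate the target word (a single sub-token in this example)
target_id = tokenizer.convert_tokens_to_ids(target_word)
token_pos = (inputs["input_ids"][0] == target_id).nonzero()[0].item()

layer_states = hidden_states[:, 0, token_pos, :]   # (L+1, hidden_size)
word_in_context = pooler(layer_states)              # (hidden_size,)
print(word_in_context.shape)                        # torch.Size([768])

In the paper, the layer-combination weights are learned through self-supervision on an automatically generated corpus rather than left untrained as in this sketch.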
Anthology ID:
2021.findings-emnlp.49
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2021
Month:
November
Year:
2021
Address:
Punta Cana, Dominican Republic
Editors:
Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue:
Findings
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
534–546
URL:
https://aclanthology.org/2021.findings-emnlp.49
DOI:
10.18653/v1/2021.findings-emnlp.49
Cite (ACL):
Yuki Arase and Tomoyuki Kajiwara. 2021. Distilling Word Meaning in Context from Pre-trained Language Models. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 534–546, Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Distilling Word Meaning in Context from Pre-trained Language Models (Arase & Kajiwara, Findings 2021)
PDF:
https://preview.aclanthology.org/nschneid-patch-4/2021.findings-emnlp.49.pdf
Video:
https://preview.aclanthology.org/nschneid-patch-4/2021.findings-emnlp.49.mp4
Code
yukiar/distil_wic
Data
PAWSWiC