DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations

John Giorgi, Osvald Nitski, Bo Wang, Gary Bader


Abstract
Sentence embeddings are an important component of many natural language processing (NLP) systems. Like word embeddings, sentence embeddings are typically learned on large text corpora and then transferred to various downstream tasks, such as clustering and retrieval. Unlike word embeddings, the highest performing solutions for learning sentence embeddings require labelled data, limiting their usefulness to languages and domains where labelled data is abundant. In this paper, we present DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations. Inspired by recent advances in deep metric learning (DML), we carefully design a self-supervised objective for learning universal sentence embeddings that does not require labelled training data. When used to extend the pretraining of transformer-based language models, our approach closes the performance gap between unsupervised and supervised pretraining for universal sentence encoders. Importantly, our experiments suggest that the quality of the learned embeddings scale with both the number of trainable parameters and the amount of unlabelled training data. Our code and pretrained models are publicly available and can be easily adapted to new domains or used to embed unseen text.
Anthology ID:
2021.acl-long.72
Volume:
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Month:
August
Year:
2021
Address:
Online
Editors:
Chengqing Zong, Fei Xia, Wenjie Li, Roberto Navigli
Venues:
ACL | IJCNLP
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
879–895
Language:
URL:
https://aclanthology.org/2021.acl-long.72
DOI:
10.18653/v1/2021.acl-long.72
Bibkey:
Cite (ACL):
John Giorgi, Osvald Nitski, Bo Wang, and Gary Bader. 2021. DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 879–895, Online. Association for Computational Linguistics.
Cite (Informal):
DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations (Giorgi et al., ACL-IJCNLP 2021)
Copy Citation:
PDF:
https://preview.aclanthology.org/nschneid-patch-2/2021.acl-long.72.pdf
Video:
 https://preview.aclanthology.org/nschneid-patch-2/2021.acl-long.72.mp4
Code
 JohnGiorgi/DeCLUTR +  additional community code
Data
GLUEMultiNLIOpenWebTextSNLISSTSST-2SST-5SentEvalWebText