SumCSE: Summary as a transformation for Contrastive Learning
Raghuveer Thirukovalluru, Xiaolan Wang, Jun Chen, Shuyang Li, Jie Lei, Rong Jin, Bhuwan Dhingra
Abstract
Sentence embedding models are typically trained using contrastive learning (CL), either using human annotations directly or by repurposing other annotated datasets. In this work, we explore the recently introduced paradigm of generating CL data using generative language models (LMs). In CL for computer vision (CV), compositional transformations (a series of operations applied to an image, e.g., cropping + color distortion) that modify the input to retain minimal information have been shown to be very effective. We show that composing a ‘Summary’ transformation with diverse paraphrasing/contradicting transformations accomplishes the same and works very well in CL for sentence embeddings. Our final generated dataset (using Vicuna-13B) significantly outperforms the previous best unsupervised method (using ChatGPT) by 1.8 points, and SimCSE, a strong supervised baseline, by 0.3 points on the semantic textual similarity (STS) benchmark.
- Anthology ID:
- 2024.findings-naacl.227
- Volume:
- Findings of the Association for Computational Linguistics: NAACL 2024
- Month:
- June
- Year:
- 2024
- Address:
- Mexico City, Mexico
- Editors:
- Kevin Duh, Helena Gomez, Steven Bethard
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 3577–3588
- URL:
- https://aclanthology.org/2024.findings-naacl.227
- Cite (ACL):
- Raghuveer Thirukovalluru, Xiaolan Wang, Jun Chen, Shuyang Li, Jie Lei, Rong Jin, and Bhuwan Dhingra. 2024. SumCSE: Summary as a transformation for Contrastive Learning. In Findings of the Association for Computational Linguistics: NAACL 2024, pages 3577–3588, Mexico City, Mexico. Association for Computational Linguistics.
- Cite (Informal):
- SumCSE: Summary as a transformation for Contrastive Learning (Thirukovalluru et al., Findings 2024)
- PDF:
- https://preview.aclanthology.org/jeptaln-2024-ingestion/2024.findings-naacl.227.pdf