HMCL: Task-Optimal Text Representation Adaptation through Hierarchical Contrastive Learning
Zhenyi Wang, Yapeng Jia, Haiyan Ning, Peng Wang, Dan Wang, Yitao Cao
Abstract
As general large language models continue to advance, adapting them to real-world tasks through effective fine-tuning remains a significant challenge. We introduce Hierarchical Multilevel Contrastive Learning (HMCL), a contrastive learning framework that improves task-specific text representations for general models. HMCL integrates three-level semantic differentiation (positive, weak-positive, and negative examples) and unifies contrastive learning, pair classification, and ranking objectives into a cohesive optimization strategy. HMCL demonstrates strong results across multi-domain and multilingual benchmarks, including text similarity, retrieval, reranking, and Retrieval-Augmented Generation (RAG) tasks. It outperforms top unsupervised methods and supervised fine-tuning approaches while maintaining broad compatibility with architectures ranging from BERT to Qwen (330M to 7B parameters). In real-world merchant consultation scenarios, HMCL shows a 0.70–6.24 point improvement over the original fine-tuning methods on large-scale base models. This establishes HMCL as a versatile solution that bridges the gap between general-purpose models and specialized industrial applications.
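The abstract only sketches the method at a high level; the exact loss is given in the paper's PDF. As a rough, hypothetical illustration of the three-level idea, the following minimal PyTorch sketch implements an InfoNCE-style objective in which the strong positive must outrank both the weak positive and the negatives, while the weak positive must still outrank the negatives. The temperature, weighting, and function names are illustrative assumptions, not values or APIs from the paper.

```python
# Hypothetical sketch of a three-level (positive / weak-positive / negative)
# contrastive objective. NOT the paper's exact loss: temperature and
# weak_weight below are assumed values for illustration only.
import torch
import torch.nn.functional as F

def hierarchical_contrastive_loss(anchor, pos, weak_pos, negs,
                                  temperature=0.05, weak_weight=0.5):
    """anchor, pos, weak_pos: (d,) embeddings; negs: (k, d) embeddings."""
    anchor = F.normalize(anchor, dim=-1)
    pos = F.normalize(pos, dim=-1)
    weak_pos = F.normalize(weak_pos, dim=-1)
    negs = F.normalize(negs, dim=-1)

    sim_pos = anchor @ pos / temperature    # scalar similarity
    sim_weak = anchor @ weak_pos / temperature
    sim_negs = negs @ anchor / temperature  # (k,) similarities

    target = torch.zeros(1, dtype=torch.long)  # correct class is index 0

    # Level 1: strong positive contrasted against weak positive AND negatives,
    # so it is pushed above both.
    logits_full = torch.cat([sim_pos.unsqueeze(0),
                             sim_weak.unsqueeze(0),
                             sim_negs])
    loss_pos = F.cross_entropy(logits_full.unsqueeze(0), target)

    # Level 2: weak positive contrasted against negatives only, so it still
    # ranks above negatives but is never pushed above the strong positive.
    logits_weak = torch.cat([sim_weak.unsqueeze(0), sim_negs])
    loss_weak = F.cross_entropy(logits_weak.unsqueeze(0), target)

    return loss_pos + weak_weight * loss_weak

# Usage with random embeddings (d = embedding dim, k = number of negatives):
d, k = 768, 8
loss = hierarchical_contrastive_loss(torch.randn(d), torch.randn(d),
                                     torch.randn(d), torch.randn(k, d))
print(loss.item())
```

The asymmetry between the two terms is what encodes the hierarchy: the weak positive appears as a distractor in the first term but as the target in the second, inducing the ordering positive > weak-positive > negative.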
- Anthology ID: 2025.findings-emnlp.727
- Volume: Findings of the Association for Computational Linguistics: EMNLP 2025
- Month: November
- Year: 2025
- Address: Suzhou, China
- Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
- Venue: Findings
- Publisher: Association for Computational Linguistics
- Pages: 13495–13518
- URL: https://preview.aclanthology.org/ingest-luhme/2025.findings-emnlp.727/
- DOI: 10.18653/v1/2025.findings-emnlp.727
- Cite (ACL): Zhenyi Wang, Yapeng Jia, Haiyan Ning, Peng Wang, Dan Wang, and Yitao Cao. 2025. HMCL: Task-Optimal Text Representation Adaptation through Hierarchical Contrastive Learning. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 13495–13518, Suzhou, China. Association for Computational Linguistics.
- Cite (Informal): HMCL: Task-Optimal Text Representation Adaptation through Hierarchical Contrastive Learning (Wang et al., Findings 2025)
- PDF: https://preview.aclanthology.org/ingest-luhme/2025.findings-emnlp.727.pdf