HMCL: Task-Optimal Text Representation Adaptation through Hierarchical Contrastive Learning

Zhenyi Wang, Yapeng Jia, Haiyan Ning, Peng Wang, Dan Wang, Yitao Cao


Abstract
As general large language models continue to advance, adapting them to real-world tasks through effective fine-tuning remains a significant challenge. We introduce Hierarchical Multilevel Contrastive Learning (HMCL), a new contrastive learning framework that improves task-specific text representations for general models. HMCL integrates three-level semantic differentiation (positive, weak-positive, and negative) and unifies contrastive learning, pair classification, and ranking objectives into a cohesive optimization strategy. HMCL demonstrates exceptional results across multi-domain and multilingual benchmarks, including text similarity, retrieval, reranking, and Retrieval-Augmented Generation (RAG) tasks. It outperforms top unsupervised methods and supervised fine-tuning approaches while remaining broadly compatible with architectures ranging from BERT to Qwen, from 330M to 7B parameters. In real-world merchant consultation scenarios, HMCL achieves a 0.70–6.24 point improvement over the original fine-tuning methods on large-scale base models. This establishes HMCL as a versatile solution that bridges the gap between general-purpose models and specialized industrial applications.
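The abstract names the three semantic levels and the unified objective but does not give the exact loss. Purely as an illustration, the Python sketch below shows one plausible way a three-level (positive / weak-positive / negative) contrastive term could be written in PyTorch; the function name, temperature, and weak_weight down-weighting are assumptions for this sketch, not the paper's formulation.

import torch
import torch.nn.functional as F


def hierarchical_contrastive_loss(anchor, positive, weak_positive, negatives,
                                  temperature=0.05, weak_weight=0.5):
    """Illustrative three-level InfoNCE-style loss for a single anchor.

    anchor, positive, weak_positive: tensors of shape (dim,)
    negatives: tensor of shape (num_negatives, dim)
    """
    # Temperature-scaled cosine similarities between the anchor and each level.
    pos_sim = F.cosine_similarity(anchor, positive, dim=-1) / temperature
    weak_sim = F.cosine_similarity(anchor, weak_positive, dim=-1) / temperature
    neg_sim = F.cosine_similarity(anchor.unsqueeze(0), negatives, dim=-1) / temperature

    # One shared denominator covering the positive, the weak-positive, and all negatives.
    logits = torch.cat([pos_sim.reshape(1), weak_sim.reshape(1), neg_sim])
    log_denom = torch.logsumexp(logits, dim=0)

    # The positive is pulled toward the anchor at full weight, the weak-positive
    # at a reduced weight; negatives enter only through the shared denominator.
    return -(pos_sim - log_denom) - weak_weight * (weak_sim - log_denom)


# Toy usage with random 768-dimensional embeddings and 8 negatives.
torch.manual_seed(0)
a, p, wp = torch.randn(768), torch.randn(768), torch.randn(768)
negs = torch.randn(8, 768)
print(hierarchical_contrastive_loss(a, p, wp, negs))

The down-weighted weak-positive term is one simple way to realize a middle semantic tier between hard positives and negatives; the paper's actual combination with pair-classification and ranking objectives is not reproduced here.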
Anthology ID:
2025.findings-emnlp.727
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
13495–13518
URL:
https://preview.aclanthology.org/ingest-luhme/2025.findings-emnlp.727/
DOI:
10.18653/v1/2025.findings-emnlp.727
Cite (ACL):
Zhenyi Wang, Yapeng Jia, Haiyan Ning, Peng Wang, Dan Wang, and Yitao Cao. 2025. HMCL: Task-Optimal Text Representation Adaptation through Hierarchical Contrastive Learning. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 13495–13518, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
HMCL: Task-Optimal Text Representation Adaptation through Hierarchical Contrastive Learning (Wang et al., Findings 2025)
PDF:
https://preview.aclanthology.org/ingest-luhme/2025.findings-emnlp.727.pdf
Checklist:
2025.findings-emnlp.727.checklist.pdf