YNU-HPCC at SemEval-2025 Task 5: Contrastive Learning for GND Subject Tagging with Multilingual Sentence-BERT

Hong Jiang, Jin Wang, Xuejie Zhang


Abstract
This paper describes the YNU-HPCC (alias JH) team's participation in Sub-task 2 of SemEval-2025 Task 5, which requires fine-tuning language models to align subject tags with the TIBKAT collection. The task presents three key challenges: cross-disciplinary document coverage, bilingual (English-German) processing, and extreme classification over more than 200,000 GND subjects. To address these challenges, we apply a contrastive learning framework using multilingual Sentence-BERT models and implement two training strategies: mixed-negative multi-label sampling and single-label sampling with random negative selection. Our best-performing model improves average recall by 28.6%, reaching 0.2252 on the core-test set and 0.1677 on the all-test set. Notably, the response to each strategy depends on the model architecture: MiniLM-series models benefit from multi-label training (+33.5% zero-shot recall), while mpnet variants excel with the single-label approach (+230.3% zero-shot recall). The study further demonstrates the effectiveness of contrastive learning for multilingual semantic alignment in low-resource scenarios and provides insights for extreme classification tasks.
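The abstract names single-label sampling with random negative selection but does not spell out the procedure. The sketch below shows one plausible reading, not the authors' implementation: each (document, gold tag) pair becomes a positive, and negatives are drawn uniformly from the tags not assigned to that document. The function name, parameters, and toy tag strings are all hypothetical and chosen only for illustration.

```python
import random


def single_label_triplets(doc_text, gold_tags, tag_vocab,
                          negatives_per_positive=1, seed=0):
    """Build (anchor, positive, negative) triplets for one document.

    Assumption: one positive tag per training example, with negatives
    sampled uniformly at random from tags outside the gold set.
    """
    rng = random.Random(seed)
    gold = set(gold_tags)
    # Candidate negatives are all vocabulary tags not assigned to the document.
    candidates = [tag for tag in tag_vocab if tag not in gold]
    k = min(negatives_per_positive, len(candidates))
    triplets = []
    for positive in gold_tags:
        # Draw k distinct negatives for this positive.
        for negative in rng.sample(candidates, k):
            triplets.append((doc_text, positive, negative))
    return triplets


# Toy data (hypothetical, not taken from TIBKAT or the GND).
vocab = ["Machine learning", "Botany", "Thermodynamics",
         "Medieval history", "Graph theory"]
doc = "An introduction to neural networks and graph algorithms"
gold_tags = ["Machine learning", "Graph theory"]

triplets = single_label_triplets(doc, gold_tags, vocab,
                                 negatives_per_positive=2)
```

Triplets of this shape are the usual input to a triplet-style or in-batch contrastive objective; which loss the system actually used is not stated in the abstract.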
Anthology ID:
2025.semeval-1.318
Volume:
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Sara Rosenthal, Aiala Rosá, Debanjan Ghosh, Marcos Zampieri
Venues:
SemEval | WS
Publisher:
Association for Computational Linguistics
Pages:
2443–2448
URL:
https://preview.aclanthology.org/transition-to-people-yaml/2025.semeval-1.318/
Cite (ACL):
Hong Jiang, Jin Wang, and Xuejie Zhang. 2025. YNU-HPCC at SemEval-2025 Task 5: Contrastive Learning for GND Subject Tagging with Multilingual Sentence-BERT. In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 2443–2448, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
YNU-HPCC at SemEval-2025 Task 5: Contrastive Learning for GND Subject Tagging with Multilingual Sentence-BERT (Jiang et al., SemEval 2025)
PDF:
https://preview.aclanthology.org/transition-to-people-yaml/2025.semeval-1.318.pdf