NCLTeam at SemEval-2025 Task 10: Enhancing Multilingual, multi-class, and Multi-Label Document Classification via Contrastive Learning Augmented Cascaded UNet and Embedding based Approaches

Shu Li, George Williamson, Huizhi Liang


Abstract
SemEval-2025 Task 10 Subtask 2 presents a multi-task, multi-label text classification challenge. The task requires systems to classify documents simultaneously across three distinct topics: Climate Change (CC), the Ukraine-Russia War (URW), and others. Several challenges were identified, including the intrinsic distinctness of the topics, the imbalance of categories, the shortage of samples, and the differing distributions of the development set and the test set. To address these challenges, two deep learning models were implemented. The first is the Contrastive learning augmented Cascaded UNet model (CCU), which employs a cascaded architecture to process all subtasks jointly. This model uses a UNet-style architecture to classify embeddings extracted by the base text encoder. A domain adaptation method was implemented to facilitate joint learning across the different document topics. We address data insufficiency through contrastive learning and mitigate data imbalance with an asymmetric loss function. The second approach is a shallow machine learning model: transformer encoder models extract text embeddings from various aspects, machine learning methods then perform the classification, and the results are compared with the baseline. The UNet-style model achieves the highest samples F1 score of our approaches, 0.365 on the test set, placing 5th on the leaderboard. The source code developed for this paper is available at
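As an illustration of how an asymmetric loss can counter label imbalance, the following is a minimal sketch of one common formulation for multi-label classification: a focal-style loss with separate focusing exponents for positive and negative labels and a probability margin that discards very easy negatives. The abstract does not state which variant or hyperparameters the authors used, so the function name `asymmetric_loss` and the defaults `gamma_pos`, `gamma_neg`, and `margin` are assumptions chosen for illustration only.

```python
import math


def asymmetric_loss(probs, targets, gamma_pos=0.0, gamma_neg=4.0,
                    margin=0.05, eps=1e-8):
    """Sum an asymmetric focal-style loss over the labels of one document.

    probs   -- predicted probability per label, each in [0, 1]
    targets -- 1 if the label applies to the document, else 0
    All hyperparameter defaults are illustrative assumptions.
    """
    total = 0.0
    for p, y in zip(probs, targets):
        if y == 1:
            # Positive label: focal term with its own (usually small) exponent.
            total -= (1.0 - p) ** gamma_pos * math.log(max(p, eps))
        else:
            # Negative label: shift the probability down by the margin so that
            # very easy negatives contribute exactly zero loss.
            p_m = max(p - margin, 0.0)
            total -= p_m ** gamma_neg * math.log(max(1.0 - p_m, eps))
    return total


# Three labels for one document; only the first label is present.
example = asymmetric_loss([0.7, 0.2, 0.03], [1, 0, 0])
```

With both exponents and the margin set to zero, this reduces to ordinary binary cross-entropy. Raising `gamma_neg` above `gamma_pos` shrinks the contribution of the many easy negative labels, which is the property that makes this family of losses attractive when most labels are absent from most documents.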
Anthology ID:
2025.semeval-1.58
Volume:
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Sara Rosenthal, Aiala Rosá, Debanjan Ghosh, Marcos Zampieri
Venues:
SemEval | WS
Publisher:
Association for Computational Linguistics
Pages:
418–423
URL:
https://preview.aclanthology.org/corrections-2025-08/2025.semeval-1.58/
Cite (ACL):
Shu Li, George Williamson, and Huizhi Liang. 2025. NCLTeam at SemEval-2025 Task 10: Enhancing Multilingual, multi-class, and Multi-Label Document Classification via Contrastive Learning Augmented Cascaded UNet and Embedding based Approaches. In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 418–423, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
NCLTeam at SemEval-2025 Task 10: Enhancing Multilingual, multi-class, and Multi-Label Document Classification via Contrastive Learning Augmented Cascaded UNet and Embedding based Approaches (Li et al., SemEval 2025)
PDF:
https://preview.aclanthology.org/corrections-2025-08/2025.semeval-1.58.pdf