Zhuoying Li

2026

To assess homograph appropriateness in narrative contexts for SemEval-2026 Task 5, we propose a contrastive regression framework. This approach combines candidate sense definitions with full narrative texts to establish an MSE regression baseline, further enhanced by a contextual grouping ranking loss that models relative rationality among senses. Evaluated on the official AmbiStory dataset, our method consistently outperforms the baseline in accuracy and Spearman correlation. These results validate the efficacy of relative order modeling for capturing fine-grained semantic nuances in complex narratives. The code is available at: https://github.com/daojiaxu/Semeval2026task5.
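The combination of an MSE objective with a within-group ranking term can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the pairwise hinge form, the function names, and the `alpha` and `margin` values are all hypothetical, and a "group" is taken to mean the candidate senses scored against one narrative.

```python
from itertools import combinations

def mse_loss(preds, targets):
    """Mean squared error over one group of candidate senses."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)

def group_ranking_loss(preds, targets, margin=0.1):
    """Pairwise hinge loss over pairs in a group whose gold scores differ.

    Assumption: the ranking term is a hinge on score differences; the
    abstract does not specify the exact form.
    """
    losses = []
    for i, j in combinations(range(len(preds)), 2):
        if targets[i] == targets[j]:
            continue  # tied gold scores express no preference
        sign = 1.0 if targets[i] > targets[j] else -1.0
        losses.append(max(0.0, margin - sign * (preds[i] - preds[j])))
    return sum(losses) / len(losses) if losses else 0.0

def combined_loss(preds, targets, alpha=0.5, margin=0.1):
    """MSE plus a weighted ranking term (alpha is a hypothetical weight)."""
    return mse_loss(preds, targets) + alpha * group_ranking_loss(preds, targets, margin)
```

In this sketch the ranking term only penalises pairs that are ordered wrongly or separated by less than the margin, so it adds relative-order information without replacing the absolute regression target.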

This paper describes the system developed by the PEU Lab for SemEval-2026 Task 4, specifically focusing on Track A: Comparative Narrative Similarity. To address the pairwise nature of the task, a lightweight contrastive ranking approach is proposed. Specifically, the pretrained RoBERTa-Large model is utilized to encode the anchor and candidate stories. Rather than employing standard cross-entropy, a margin ranking loss is introduced, which allows the relative narrative proximity between different candidate stories to be explicitly modeled. Furthermore, a 5-fold cross-validation ensemble strategy is integrated to stabilize predictions on unseen data. Evaluated on the official dataset, the optimal configuration achieved an overall accuracy of 64.50%, demonstrating the effectiveness of relative order modeling. The code for this system is available at: https://github.com/mhchhh/SemEval2026-Task-4.
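A margin ranking loss over an anchor and two candidate stories, followed by a cross-fold ensemble, might look like the sketch below. Several details are assumptions rather than facts from the abstract: plain vectors stand in for RoBERTa-Large encodings, cosine similarity is assumed as the proximity score, the folds are assumed to be combined by averaging score differences, and all names and the `margin` value are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def margin_ranking_loss(score_a, score_b, label, margin=0.2):
    """Hinge loss on the score gap.

    label is +1 if candidate A should be closer to the anchor, -1 if B.
    """
    return max(0.0, margin - label * (score_a - score_b))

def ensemble_predict(fold_score_pairs):
    """Average (score_a - score_b) over folds and pick the closer candidate.

    Assumption: the ensemble averages score differences; the abstract only
    says a 5-fold cross-validation ensemble is used.
    """
    mean_diff = sum(a - b for a, b in fold_score_pairs) / len(fold_score_pairs)
    return "A" if mean_diff > 0 else "B"
```

The loss is zero once the correct candidate beats the other by at least the margin, which is what lets relative proximity be trained directly instead of through a class label.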

2025

This paper presents our work on SemEval-2025 Task 9: Food Hazard Detection Challenge, focusing on the application of ModernBERT to food safety data classification. The ModernBERT model achieved a score of 0.7952 on the validation set and 0.7729 on the final test set, outperforming other models. Comparative experiments with various deep learning architectures further confirmed the advantage of ModernBERT in food hazard detection. The results demonstrate the potential of ModernBERT in food safety management and support its practical application in the field. The code of this paper is available at: https://github.com/daojiaxu/semeval_2025_Task-9.

2024

Puer at SemEval-2024 Task 2: A BioLinkBERT Approach to Biomedical Natural Language Inference
Jiaxu Dao | Zhuoying Li | Xiuzhong Tang | Xiaoli Lan | Junde Wang
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

This paper describes our application of BioLinkBERT to clinical trial data for SemEval-2024 Task 2. Focusing on the biomedical NLI task, our approach uses the BioLinkBERT-large model, fine-tuned with a mixed loss function that combines contrastive learning and cross-entropy loss. This method surpassed the established benchmark, achieving an F1 score of 0.72. Additionally, we conducted a comparative analysis of various deep learning architectures, including BERT, ALBERT, and XLM-RoBERTa, within the context of medical text mining. The findings show our method's stronger performance and suggest directions for future research in biomedical data processing. Our experiment source code is available on GitHub at: https://github.com/daojiaxu/semeval2024_task2.
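A mixed loss of this kind can be sketched as a weighted sum of a cross-entropy term and a contrastive term. The abstract does not give the exact formulation, so everything specific here is an assumption: the pairwise contrastive form over two sentence embeddings, the weight `lam`, the `margin`, and the function names are hypothetical.

```python
import math

def cross_entropy(logits, label):
    """Softmax cross-entropy for one example, using a stable log-sum-exp."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[label]

def pair_contrastive(u, v, same_label, margin=1.0):
    """Pairwise contrastive loss on Euclidean distance (assumed form).

    Same-label pairs are pulled together; different-label pairs are pushed
    apart until they are at least `margin` away.
    """
    d = math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return d ** 2 if same_label else max(0.0, margin - d) ** 2

def mixed_loss(logits, label, u, v, same_label, lam=0.1):
    """Cross-entropy plus a weighted contrastive term (lam is hypothetical)."""
    return cross_entropy(logits, label) + lam * pair_contrastive(u, v, same_label)
```

Under these assumptions the cross-entropy term drives the classification decision while the contrastive term shapes the embedding space, with `lam` controlling the balance between the two.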

Puer at SemEval-2024 Task 4: Fine-tuning Pre-trained Language Models for Meme Persuasion Technique Detection
Jiaxu Dao | Zhuoying Li | Youbang Su | Wensheng Gong
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

This paper summarizes our research on multilingual detection of persuasion techniques in memes for SemEval-2024 Task 4. Our work focused on English Subtask 1 and is based on a roberta-large pre-trained model, provided by the Transformers library, that was fine-tuned on a corpus of social media posts. Our method significantly outperforms the officially released baseline and ranked 7th in English Subtask 1 on the test set. This paper also compares the performance of different deep learning model architectures, such as BERT, ALBERT, and XLM-RoBERTa, on multilingual detection of persuasion techniques in memes. The experimental source code covered in the paper will later be released on GitHub.

Venues

SemEval (5)
WS (3)