Wanzhao Zhang
2026
CICL26 at SemEval-2026 Task 4: Narrative Story Similarity and Narrative Representation Learning
Wanzhao Zhang | Yue Yu
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
This paper describes our submission to SemEval-2026 Task 4 (Track A) on narrative similarity. The task requires systems to determine which of two candidate stories is more narratively similar to a given anchor story. While large language models (LLMs) demonstrate strong semantic reasoning abilities, their predictions in comparative settings can be sensitive to stochastic decoding and input order. We propose a lightweight inference-time cascade strategy that improves robustness without modifying the underlying model. Our approach combines self-consistency voting to reduce sampling variance, a swap-based symmetry test to mitigate positional bias, and a margin-based deterministic decision rule to resolve disagreements. This design explicitly leverages model uncertainty while maintaining reproducibility and simplicity.
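The three stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the stages are wired together, so the control flow, the `judge` and `scorer` callables, the "A"/"B" answer labels, and the default of five samples are all assumptions made here for illustration.

```python
from typing import Callable

# Hypothetical judge: given (anchor, first, second), answers "A" if the first
# candidate is closer to the anchor, else "B". It may be stochastic (an LLM call).
Judge = Callable[[str, str, str], str]
# Hypothetical scorer: deterministic similarity of one candidate to the anchor.
Scorer = Callable[[str, str], float]


def majority_vote(judge: Judge, anchor: str, first: str, second: str, k: int) -> str:
    """Self-consistency: sample the judge k times and keep the majority answer."""
    votes_for_first = sum(judge(anchor, first, second) == "A" for _ in range(k))
    return "A" if votes_for_first * 2 > k else "B"


def cascade_decide(judge: Judge, scorer: Scorer, anchor: str,
                   cand1: str, cand2: str, k: int = 5) -> int:
    """Return 1 or 2, the index of the candidate judged closer to the anchor."""
    # Stage 1: self-consistency vote with the candidates in their given order.
    original = majority_vote(judge, anchor, cand1, cand2, k)
    # Stage 2: swap test. Repeat with the order reversed and map the answer
    # back, so that "A" again refers to cand1.
    swapped = majority_vote(judge, anchor, cand2, cand1, k)
    swapped_mapped = "B" if swapped == "A" else "A"
    if original == swapped_mapped:
        return 1 if original == "A" else 2
    # Stage 3: the two orders disagree, so fall back to a deterministic
    # margin between the candidates' scores (assumed rule; ties go to cand1).
    margin = scorer(anchor, cand1) - scorer(anchor, cand2)
    return 1 if margin >= 0 else 2
```

With a judge that always prefers whichever story is shown first, the swap test detects the disagreement and the decision falls through to the margin rule, so the result no longer depends on input order.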
2025
CICL at SemEval-2025 Task 9: A Pilot Study on Different Machine Learning Models for Food Hazard Detection Challenge
Weiting Wang | Wanzhao Zhang
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
This paper describes our approaches to SemEval-2025 Task 9, a multiclass classification task to detect food hazards and affected products, given food incident reports from web resources. The training data consists of the date of the incidents and the text of the incident reports, as well as the labels: “hazard-category” and “product-category” for task 1, “hazard” and “product” for task 2. We primarily focused on solving task 1 of this challenge. Our approach took two directions: first, we fine-tuned BERT-based models (BERT and ModernBERT); second, we also applied LinearSVC, a random forest classifier, and LightGBM to the challenge. From the experiments, we learned that the BERT-based models outperformed the other models mentioned above, and that applying focal loss to the BERT-based models improved their performance on imbalanced classification tasks.
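For readers unfamiliar with focal loss, the sketch below shows its standard single-example form, FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t), where p_t is the predicted probability of the true class. The abstract does not report the hyperparameters or the implementation used, so the `gamma` and `alpha` defaults and the function name here are illustrative assumptions only.

```python
import math
from typing import Sequence


def focal_loss(probs: Sequence[float], target: int,
               gamma: float = 2.0, alpha: float = 1.0) -> float:
    """Focal loss for one example, given predicted class probabilities.

    gamma and alpha are illustrative defaults, not values from the paper.
    """
    # Clamp to avoid log(0) when the true class is assigned zero probability.
    p_t = max(probs[target], 1e-12)
    # The (1 - p_t)^gamma factor shrinks the loss of well-classified examples,
    # so hard (often minority-class) examples dominate the gradient.
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)
```

With `gamma=0` and `alpha=1` the expression reduces to ordinary cross-entropy; larger `gamma` values down-weight confident predictions more strongly, which is why the loss is commonly used for imbalanced classification.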