Lu Gehao

2026

zhangpeng at SemEval-2026 Task 9: Detecting Multilingual, Multicultural and Multievent Online Polarization
Zhang Peng | Lu Gehao
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)

This paper presents our system for SemEval-2026 Task 9: Detecting Multilingual, Multicultural and Multievent Online Polarization. The system covers three multilingual text classification subtasks: Subtask 1 (Polarization Detection), Subtask 2 (Polarization Type Classification), and Subtask 3 (Manifestation Identification). For Subtask 1, we explored classical text representation approaches, including Bag-of-Words, Word2Vec Average Vectors, and Bag-of-Centroids. Among these methods, the Bag-of-Centroids model achieved the best performance on both the development and test datasets. For Subtask 2 and Subtask 3, we fine-tuned four different pre-trained language models: google-bert, FacebookAI-roberta, dccuchile-bert, and distilbert-multi. Our experiments proceed in three steps: 1) analyzing the training data visually, 2) training multiple single models on the training data, and 3) combining those single models through weighted-voting ensemble learning. We further study the influence of different hyperparameters on the ensemble model and select the best ensemble for predicting the test set. On the official test set, our system achieved Macro-F1 scores of 0.6882 (EN) and 0.6711 (SP) for Subtask 1, 0.3752 (EN) and 0.6386 (SP) for Subtask 2, and 0.3561 (EN) and 0.4366 (SP) for Subtask 3. The organizers use the Macro-F1 score for the final ranking. These approaches yielded good results.

zhangpeng at SemEval-2026 Task 10: PsyCoMark - Psycholinguistic Conspiracy Marker Extraction and Detection
Zhang Peng | Lu Gehao
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)

We describe our system for SemEval-2026 Task 10 on psycholinguistic conspiracy marker extraction and conspiracy detection from English texts. The shared task consists of two subtasks: (1) extracting conspiracy-related markers, namely actor, action, effect, victim, and evidence, evaluated using an overlap-based macro F1-score, and (2) detecting conspiracy content as a binary text classification problem evaluated using macro-averaged F1-score. Our approach relies on fine-tuning pre-trained transformer encoders, including multilingual DistilBERT variants and DeBERTa-v3, without using external corpora or data augmentation techniques. Experimental results show that our best models achieve a macro-F1 score of 0.1476 for Subtask 1 and a Weighted-F1 score of 0.7267 for Subtask 2. These results show that simple fine-tuning of pre-trained models provides a strong baseline for both marker extraction and conspiracy detection.

Co-authors

Zhang Peng 2

Venues

SemEval 2
WS 2
