Jorge Gómez-Navalón

2026

UMUTeam at SemEval-2026 Task 10: Transformer Ensembles for Conspiratorial Span Extraction and Detection
Jorge Gómez-Navalón | Ronghao Pan | Tomás Bernal-Beltrán | José Antonio García-Díaz | Rafael Valencia-Garcia
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)

Conspiracy theories pose significant societal risks and require reliable automated detection methods. In this paper, we present our system for SemEval-2026 Task 10, which addresses both conspiracy detection and psycholinguistic marker extraction. We leverage multiple pretrained transformer architectures and ensemble strategies to model conspiratorial discourse at both the document and token levels. For classification, our ensemble achieves a weighted F1-score of 0.7688 in distinguishing conspiratorial statements. For marker extraction, we formulate the task as a BIOES sequence labeling problem and enhance predictions through ensemble and specialist models. Our results highlight both the effectiveness of transformer-based approaches and the challenges of fine-grained conspiracy marker extraction.
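
As a rough illustration of the BIOES formulation mentioned in the abstract, the sketch below converts token-level spans into BIOES tags (Begin, Inside, Outside, End, Single). The function name, the example tokens, and the marker labels `ACTOR` and `ACTION` are hypothetical and are not taken from the paper or the task data.

```python
from typing import List, Tuple


def spans_to_bioes(n_tokens: int, spans: List[Tuple[int, int, str]]) -> List[str]:
    """Convert (start, end, label) token spans, end inclusive, to BIOES tags.

    Assumes spans do not overlap; labels here are illustrative only.
    """
    tags = ["O"] * n_tokens
    for start, end, label in spans:
        if start == end:
            # A one-token span gets the Single tag.
            tags[start] = f"S-{label}"
        else:
            tags[start] = f"B-{label}"
            for i in range(start + 1, end):
                tags[i] = f"I-{label}"
            tags[end] = f"E-{label}"
    return tags


# Hypothetical example sentence and spans.
tokens = ["They", "secretly", "control", "the", "media"]
example_tags = spans_to_bioes(len(tokens), [(0, 0, "ACTOR"), (1, 2, "ACTION")])
for token, tag in zip(tokens, example_tags):
    print(f"{token}\t{tag}")
```

Compared with plain BIO tagging, the explicit End and Single tags mark span boundaries directly, which is one common reason to prefer BIOES for span extraction.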


UMUTeam at SemEval-2026 Task 6: Soft-Voting Transformer Ensembles for Detecting and Classifying Response Ambiguity in Political Discourse
Tomás Bernal-Beltrán | Ronghao Pan | Jorge Gómez-Navalón | José Antonio García-Díaz | Rafael Valencia-Garcia
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)

Political discourse frequently involves strategically ambiguous responses, particularly in high-stakes settings such as presidential debates and interviews. Detecting whether a politician has directly answered a question, provided an ambiguous reply, or issued a clear non-reply remains challenging because of the pragmatic and rhetorical nature of political language. This paper describes our participation in the SemEval-2026 CLARITY shared task on response ambiguity detection and classification in English. We focused exclusively on Task 1 (Clarity-level Classification) and proposed a weighted soft-voting ensemble that combines four fine-tuned encoder-only transformer models: RoBERTa-large, BERT-large-cased, DistilBERT-cased and ModernBERT-large. Each model was tuned through grid search, and the models' predicted class probability distributions were aggregated using a weighted linear combination. On the official test set, our system achieved a macro-F1 score of 0.71, ranking 26th out of 41 participating teams. Despite the performance gap relative to top-ranked systems, our results demonstrate that a lightweight set of moderately sized encoder models can provide stable and competitive performance without relying on external data or large-scale architectures.
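
The weighted soft-voting step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the per-model probabilities, and the weights are made up, and the sketch assumes each model outputs a probability distribution over the same three clarity classes.

```python
from typing import List, Sequence


def weighted_soft_vote(
    probs: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Combine per-model class probabilities by a weighted linear combination.

    probs[m][c] is model m's probability for class c; weights[m] is its weight.
    Weights are normalized so the result remains a probability distribution.
    """
    total = sum(weights)
    n_classes = len(probs[0])
    return [
        sum(w * p[c] for w, p in zip(weights, probs)) / total
        for c in range(n_classes)
    ]


def predict(probs: Sequence[Sequence[float]], weights: Sequence[float]) -> int:
    """Return the index of the class with the highest combined probability."""
    combined = weighted_soft_vote(probs, weights)
    return max(range(len(combined)), key=combined.__getitem__)


# Hypothetical outputs of four models for a single example (three classes).
model_probs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.5, 0.4, 0.1],
    [0.3, 0.6, 0.1],
]
# Hypothetical weights, e.g. chosen on a validation set.
model_weights = [0.4, 0.2, 0.1, 0.3]

combined = weighted_soft_vote(model_probs, model_weights)
predicted_class = predict(model_probs, model_weights)
print(combined, predicted_class)
```

Unlike hard (majority) voting, soft voting keeps each model's confidence, so a model that is strongly confident can outweigh several weakly confident ones.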

Venues

SemEval (2)
WS (2)