Jorge Gómez-Navalón


2026

Conspiracy theories pose significant societal risks and require reliable automated detection methods. In this paper, we present our system for SemEval 2026 Task 10, addressing both conspiracy detection and psycholinguistic marker extraction. We leverage multiple pretrained transformer architectures and ensemble strategies to model conspiratorial discourse at both document and token levels. For classification, our ensemble achieves a weighted F1-score of 0.7688, indicating effective performance in distinguishing conspiratorial statements. For marker extraction, we formulate the task as a BIOES sequence labeling problem and enhance predictions through ensemble and specialist models. Our results highlight both the effectiveness of transformer-based approaches and the challenges of fine-grained conspiracy marker extraction.
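To illustrate the BIOES formulation mentioned above, the following minimal sketch converts token-level marker spans into BIOES tags. The function name `spans_to_bioes`, the half-open span convention, and the label names `ACTOR` and `ACTION` are hypothetical choices made for illustration. They are not taken from the task definition or from the system described here.

```python
from typing import List, Tuple


def spans_to_bioes(n_tokens: int, spans: List[Tuple[int, int, str]]) -> List[str]:
    """Convert (start, end, label) token spans into a BIOES tag sequence.

    Spans are half-open, i.e. they cover tokens start .. end - 1.
    This span convention is an assumption made for the sketch.
    """
    tags = ["O"] * n_tokens  # every token starts outside any marker
    for start, end, label in spans:
        if not (0 <= start < end <= n_tokens):
            raise ValueError(f"invalid span ({start}, {end}) for {n_tokens} tokens")
        if end - start == 1:
            # Single-token marker
            tags[start] = f"S-{label}"
        else:
            tags[start] = f"B-{label}"        # beginning of the marker
            for i in range(start + 1, end - 1):
                tags[i] = f"I-{label}"        # inside the marker
            tags[end - 1] = f"E-{label}"      # end of the marker
    return tags


# Hypothetical example: a one-token marker and a three-token marker.
example_tags = spans_to_bioes(6, [(0, 1, "ACTOR"), (2, 5, "ACTION")])
```

Compared with plain BIO tagging, the explicit `E-` and `S-` tags mark span boundaries directly, which is one common motivation for choosing BIOES in span extraction tasks.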
Political discourse frequently involves strategically ambiguous responses, particularly in high-stakes settings such as presidential debates and interviews. Detecting whether a politician has directly answered a question, provided an ambiguous reply, or issued a clear non-reply remains challenging because of the pragmatic and rhetorical nature of political language. This paper describes our participation in the SemEval 2026 CLARITY shared task on response ambiguity detection and classification in English. We focused exclusively on Task 1 (Clarity-level Classification) and proposed a weighted soft-voting ensemble that combines four fine-tuned encoder-only transformer models: RoBERTa-large, BERT-large-cased, DistilBERT-cased, and ModernBERT-large. Each model was tuned through grid search, and the models' predicted class probability distributions were aggregated using a weighted linear combination. On the official test set, our system achieved a macro-F1 score of 0.71, ranking 26th out of 41 participating teams. Despite the performance gap relative to top-ranked systems, our results demonstrate that a lightweight set of moderately sized encoder models can provide stable and competitive performance without relying on external data or large-scale architectures.
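The weighted linear combination of class probability distributions described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the system's actual implementation: the function name `weighted_soft_vote`, the array layout, and the example weights and probabilities are all hypothetical, and the sketch normalizes the weights to sum to one.

```python
import numpy as np


def weighted_soft_vote(probs, weights):
    """Combine per-model class probabilities with a weighted average.

    probs:   array-like of shape (n_models, n_samples, n_classes)
    weights: array-like of shape (n_models,), non-negative model weights
    Returns the combined probabilities and the predicted class indices.
    """
    probs = np.asarray(probs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if probs.ndim != 3 or weights.shape != (probs.shape[0],):
        raise ValueError("expected probs of shape (n_models, n_samples, n_classes) "
                         "and one weight per model")
    weights = weights / weights.sum()  # normalize so the result stays a distribution
    # Weighted sum over the model axis -> shape (n_samples, n_classes)
    combined = np.tensordot(weights, probs, axes=1)
    return combined, combined.argmax(axis=1)


# Hypothetical example: two models, one sample, three clarity classes.
model_probs = [
    [[0.6, 0.3, 0.1]],  # model A
    [[0.1, 0.8, 0.1]],  # model B
]
combined, predicted = weighted_soft_vote(model_probs, weights=[3.0, 1.0])
```

With these illustrative weights, model A dominates and the ensemble follows its prediction. With equal weights, the same inputs would favor model B's class, which shows how the choice of weights shapes the final decision.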