Kwaku Asare

2026

AI4PC-Howard University at SemEval-2026 Task 5: Calibrated Hybrid Ensembling and Retrieval-Augmented LLM Reasoning for Narrative Word-Sense Plausibility
Kwaku Asare | Saurav Aryal
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)

We present two complementary approaches for rating word-sense plausibility in SemEval-2026 Task 5 (literary homonyms in five-sentence stories). Approach 1 is a retrieve-then-generate pipeline using an open-weight Llama 3.1 70B Instruct model with structured reasoning and a self-correction pass. Approach 2 is a hybrid ensemble that combines API-based LLM prompting with transformer representations and a learned calibration layer trained on the development set. On the development set, Approach 2 achieves Spearman ρ = 0.7393 (p ≈ 10^-102) with accuracy 0.8010 (471/588). Approach 1 achieves ρ = 0.5187 (p ≈ 10^-65) with accuracy 0.6032 (561/930). We emphasize that Approach 1 does not exceed RoBERTa-base in accuracy (0.6032 vs. 0.6410), but provides stronger rank correlation.
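The two metrics reported in the abstract can be illustrated with a minimal sketch. It computes Spearman's ρ as the Pearson correlation of average ranks and re-derives the reported accuracies from the counts given above. The function names and the toy `gold` and `pred` ratings are hypothetical and are not taken from the paper; the task's own criterion for counting a prediction as correct is not stated in the abstract, so only the ratios are checked here.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed inputs."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy ratings (illustration only, not data from the paper).
gold = [1.0, 2.5, 4.0, 3.0, 5.0]
pred = [1.2, 2.0, 3.0, 3.5, 4.8]
rho = spearman_rho(gold, pred)

# Accuracy as correct / total, using the counts quoted in the abstract.
acc_hybrid = round(471 / 588, 4)  # Approach 2
acc_rag = round(561 / 930, 4)     # Approach 1
print(rho, acc_hybrid, acc_rag)
```

The two accuracy figures rest on different totals (588 and 930 items), which is worth keeping in mind when comparing them directly.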
