Nam Dinh

2026

One and Only at SemEval-2026 Task 2: Evaluating Zero-Shot Autonomous LLM Agents and Heuristic Proxies in Ecological Affect Forecasting
Nam Dinh
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)

This paper presents team One and Only’s system for SemEval-2026 Task 2: Predicting Variation in Emotional Valence and Arousal over Time (Soni et al., 2026). We investigate whether zero-shot LLM reasoning can replace fine-tuning for ecological affect forecasting by combining deterministic statistical priors with frozen LLMs (Gemini 3 Pro, Claude Opus 4.6, GPT-5.2). For short-term state changes (Subtask 2A), an OLS mean-reversion anchor is paired with LLM-generated impulses; for long-term disposition changes (Subtask 2B), a Chain-of-Thought prompt drives direct numeric prediction. Our system underperforms fine-tuned approaches on both subtasks. However, post-submission ablation across three LLMs reveals a task-dependent pattern: CoT reasoning substantially improves disposition forecasting (r_V: −0.185 → +0.129; MAE_V: 0.899 → 0.422), while uncalibrated LLM impulses degrade state-change prediction due to variance collapse (σ_pred = 0.41 vs. σ_gold = 1.73). We provide a detailed diagnostic analysis of these failure modes and release all prompts and outputs for reproducibility.
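The abstract names two mechanisms that a small sketch can make concrete: an OLS mean-reversion anchor combined with an additive impulse, and the variance-collapse diagnostic that compares the spread of predictions with the spread of gold labels. The Python below is an illustration only, not the authors' implementation. The function names, the linear form `delta ≈ intercept + slope * prev_state`, and the `impulse_weight` parameter are all assumptions, since the abstract does not specify how the anchor is fitted or how impulses are scaled.

```python
import numpy as np


def fit_mean_reversion(prev_state, delta):
    """Fit delta ~ intercept + slope * prev_state by ordinary least squares.

    Hypothetical anchor: a negative slope means high states tend to fall and
    low states tend to rise, i.e. reversion toward the mean.
    """
    x = np.asarray(prev_state, dtype=float)
    y = np.asarray(delta, dtype=float)
    design = np.column_stack([np.ones_like(x), x])  # intercept column + predictor
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(coef[0]), float(coef[1])


def predict_change(prev_state, intercept, slope, impulse=0.0, impulse_weight=1.0):
    """Anchor prediction plus a weighted impulse (assumed additive combination)."""
    anchor = intercept + slope * np.asarray(prev_state, dtype=float)
    return anchor + impulse_weight * np.asarray(impulse, dtype=float)


def spread_ratio(pred, gold):
    """Ratio of predicted to gold standard deviation; values far below 1
    indicate that predictions are much less dispersed than the labels."""
    return float(np.std(pred) / np.std(gold))


if __name__ == "__main__":
    # Toy data, made up for illustration: change is exactly 0.5 - 0.5 * state.
    states = [0.0, 1.0, 2.0, 3.0]
    deltas = [0.5, 0.0, -0.5, -1.0]
    intercept, slope = fit_mean_reversion(states, deltas)
    print(intercept, slope)
    print(predict_change(2.0, intercept, slope, impulse=0.3))
```

With the figures reported in the abstract, the same ratio is 0.41 / 1.73, roughly 0.24, meaning the predicted state changes had about a quarter of the spread of the gold state changes.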

