AILS-NTUA at SemEval-2026 Task 12: Graph-Based Retrieval and Reflective Prompting for Abductive Event Reasoning

Nikolaos Karafyllis, Maria Lymperaiou, Giorgos Filandrianos, Athanasios Voulodimos, Giorgos Stamou


Abstract
We present a winning three-stage system for SemEval 2026 Task 12: Abductive Event Reasoning that combines graph-based retrieval, LLM-driven abductive reasoning with prompt design informed by reflective prompt evolution, and post-hoc consistency enforcement; our system ranks first on the evaluation-phase leaderboard with an accuracy score of 0.95. Cross-model error analysis across 14 models (7 families) reveals three shared inductive biases: causal chain incompleteness, proximate cause preference, and salience bias, whose cross-family convergence (51% cause-count reduction) indicates systematic rather than model-specific failure modes in multi-label causal reasoning.
Anthology ID:
2026.semeval-1.252
Volume:
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
Month:
July
Year:
2026
Address:
San Diego, California, USA
Editors:
Ekaterina Kochmar, Debanjan Ghosh, Kai North, Mamoru Komachi
Venues:
SemEval | WS
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
1997–2019
Language:
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.252/
DOI:
Bibkey:
Cite (ACL):
Nikolaos Karafyllis, Maria Lymperaiou, Giorgos Filandrianos, Athanasios Voulodimos, and Giorgos Stamou. 2026. AILS-NTUA at SemEval-2026 Task 12: Graph-Based Retrieval and Reflective Prompting for Abductive Event Reasoning. In Proceedings of the 20th International Workshop on Semantic Evaluation (2026), pages 1997–2019, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal):
AILS-NTUA at SemEval-2026 Task 12: Graph-Based Retrieval and Reflective Prompting for Abductive Event Reasoning (Karafyllis et al., SemEval 2026)
Copy Citation:
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.252.pdf