CausalMinds at SemEval-2026 Task 12: Simple Fine-Tuning with Option Shuffling Outperforms Complex Pipelines for Abductive Event Reasoning

Vidur Gupta; Xiaofei Zhao; Jason Shaye

CausalMinds at SemEval-2026 Task 12: Simple Fine-Tuning with Option Shuffling Outperforms Complex Pipelines for Abductive Event Reasoning

Abstract

We describe our system for SemEval-2026 Task 12 on Abductive Event Reasoning, which requires identifying plausible direct cause(s) of real-world events. We conduct a systematic evaluation of 23 configurations spanning prompting, retrieval-augmented generation, multi-stage verification, and supervised fine-tuning across models of different scales. Across experiments, we found that fine-tuning GPT-4.1-mini with data augmentation via option shuffling consistently outperformed more complex multi-stage pipelines and larger-model prompting strategies. Our system scores 0.88 on the test dataset, ranking 19th out of 221 submissions, which is only 0.07 away from the highest scoring submission of 0.95. Interestingly, chain-of-thought prompting and multi-stage verification hurt performance compared to simpler baselines. This reinforces that simplicity can outperform complex pipelines. We document these negative results and examine the persistent gap between development (0.991) and test (0.88) scores.

Anthology ID:: 2026.semeval-1.358
Volume:: Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
Month:: July
Year:: 2026
Address:: San Diego, California, USA
Editors:: Ekaterina Kochmar, Debanjan Ghosh, Kai North, Mamoru Komachi
Venues:: SemEval | WS
SIG:
Publisher:: Association for Computational Linguistics
Note:
Pages:: 2852–2859
Language:
URL:: https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.358/
DOI:
Bibkey:
Cite (ACL):: Vidur Gupta, Xiaofei Zhao, and Jason Shaye. 2026. CausalMinds at SemEval-2026 Task 12: Simple Fine-Tuning with Option Shuffling Outperforms Complex Pipelines for Abductive Event Reasoning. In Proceedings of the 20th International Workshop on Semantic Evaluation (2026), pages 2852–2859, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal):: CausalMinds at SemEval-2026 Task 12: Simple Fine-Tuning with Option Shuffling Outperforms Complex Pipelines for Abductive Event Reasoning (Gupta et al., SemEval 2026)
Copy Citation:
PDF:: https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.358.pdf

PDF Cite Search Fix data