Ellat at SemEval-2026 Task 11: Comparing Encoder and Decoder Models for Syllogistic Reasoning

Farzaneh Bayan Memar, Hanneke Huls, Matthijs Ten Hove


Abstract
For SemEval-2026 Task 11 (Subtask 1: English), Team Ellat investigates whether language models can assess logical validity independently of semantic plausibility. Since these models learn statistical patterns instead of explicit logical rules, they often rely on world knowledge and semantic shortcuts rather than formal logic. To address this challenge, we evaluate three architectures: MiniLM-L6-mnli-binary, DeBERTa-v3-small, and Llama 3.1-8B-Instruct, applying task-specific fine-tuning for the encoder models and Abstract Logic Augmentation with QLoRA for Llama. DeBERTa achieved the strongest overall performance, MiniLM showed clear reductions in content bias after fine-tuning, and Llama 3.1-8B-Instruct exhibited strong plausibility bias in the zero-shot setting. However, our augmented fine-tuning approach led to only modest improvements and a partial shift toward structure-based reasoning. Overall, fine-tuning and abstraction-based augmentation reduce plausibility bias, but fully separating logical validity from semantic content remains challenging across architectures.
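The abstract does not describe how Abstract Logic Augmentation is implemented. As a minimal sketch of one plausible reading, the Python below replaces the content terms of a syllogism with placeholder letters, so that the logical form is kept while the semantic content is removed. The function name `abstract_syllogism`, the explicit list of terms, and the example syllogism are all hypothetical illustrations, not the authors' code or data.

```python
import re


def abstract_syllogism(text, terms):
    """Replace each content term with a placeholder letter (A, B, C, ...).

    Hypothetical helper for illustration only; it assumes the content
    terms are already known and passed in explicitly.
    """
    if not terms:
        return text
    # Map each term (case-insensitively) to a letter in order of appearance.
    placeholders = {t.lower(): chr(ord("A") + i) for i, t in enumerate(terms)}
    # Try longer terms first so a short term never matches inside a longer one.
    ordered = sorted(placeholders, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in ordered) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: placeholders[m.group(0).lower()], text)


example = (
    "All cats are mammals. All mammals are animals. "
    "Therefore, all cats are animals."
)
abstracted = abstract_syllogism(example, ["cats", "mammals", "animals"])
print(abstracted)
```

Because the substitution leaves the quantifiers and sentence structure untouched, the validity label of the original item would carry over to the abstracted one, which is what makes such pairs usable as augmentation under this assumption.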
Anthology ID:
2026.semeval-1.270
Volume:
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
Month:
July
Year:
2026
Address:
San Diego, California, USA
Editors:
Ekaterina Kochmar, Debanjan Ghosh, Kai North, Mamoru Komachi
Venues:
SemEval | WS
Publisher:
Association for Computational Linguistics
Pages:
2130–2138
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.270/
Cite (ACL):
Farzaneh Bayan Memar, Hanneke Huls, and Matthijs Ten Hove. 2026. Ellat at SemEval-2026 Task 11: Comparing Encoder and Decoder Models for Syllogistic Reasoning. In Proceedings of the 20th International Workshop on Semantic Evaluation (2026), pages 2130–2138, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal):
Ellat at SemEval-2026 Task 11: Comparing Encoder and Decoder Models for Syllogistic Reasoning (Bayan Memar et al., SemEval 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.270.pdf
Supplementary material:
2026.semeval-1.270.SupplementaryMaterial.zip