Pasquale Grimaldi


2026

We describe our participation in SemEval-2026 Task 11, Subtask 1: determining the formal validity of syllogisms in English while minimizing the influence of content plausibility. Our system implements a neuro-symbolic pipeline that strictly separates neural and symbolic components. An LLM extracts the formal structure of each natural-language syllogism, namely the proposition types (A, E, I, O) and the three terms; the syllogistic figure is then computed deterministically, and a symbolic validator checks whether the resulting mood–figure pair belongs to the 24 classically valid Aristotelian forms. On the official evaluation we achieve 96.34% accuracy, a Total Content Effect (TCE) of 1.02, and a combined score of 56.57. Compared to pure-LLM baselines on the same backbone, our system more than doubles the combined score (from 26.52 to 56.57) and reduces TCE by nearly an order of magnitude. Replacing the extractor LLM with Claude Sonnet 4.5 preserves the combined score and TCE, confirming that content-invariance is contributed by the symbolic stage rather than by any particular LLM. A paraphrase probe reveals that the validator is invariant to surface form but that the extractor is sensitive to premise ordering, a specific, fixable limitation that we identify as the primary target for future work.
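The symbolic stage described above can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes the extractor has already produced, for each premise and the conclusion, a proposition type and a subject/predicate pair. The names `Proposition`, `mood_and_figure`, and `is_valid` are hypothetical, as is the choice to identify the major premise as the one containing the conclusion's predicate. The table lists the 24 classical forms, including the weakened (subaltern) moods that presuppose existential import.

```python
from typing import NamedTuple, Optional, Tuple


class Proposition(NamedTuple):
    kind: str       # "A", "E", "I", or "O"
    subject: str
    predicate: str


# The 24 classically valid moods, grouped by figure (6 per figure).
VALID_FORMS = {
    1: {"AAA", "EAE", "AII", "EIO", "AAI", "EAO"},
    2: {"EAE", "AEE", "EIO", "AOO", "EAO", "AEO"},
    3: {"AAI", "IAI", "AII", "EAO", "OAO", "EIO"},
    4: {"AAI", "AEE", "IAI", "EAO", "EIO", "AEO"},
}


def mood_and_figure(
    p1: Proposition, p2: Proposition, conclusion: Proposition
) -> Optional[Tuple[str, int]]:
    """Return (mood, figure), or None if the terms do not form a syllogism."""
    s, p = conclusion.subject, conclusion.predicate  # minor and major terms
    if s == p:
        return None
    # Assumption: the major premise is the one mentioning the major term,
    # regardless of the order in which the premises were stated.
    if p in (p1.subject, p1.predicate):
        major, minor = p1, p2
    else:
        major, minor = p2, p1
    major_terms = {major.subject, major.predicate}
    minor_terms = {minor.subject, minor.predicate}
    middle = major_terms - {p}
    if p not in major_terms or len(middle) != 1:
        return None
    m = next(iter(middle))
    if m in (s, p) or minor_terms != {s, m}:
        return None
    # The figure depends only on where the middle term sits in each premise.
    position = (major.subject == m, minor.subject == m)
    figure = {
        (True, False): 1,   # M-P, S-M
        (False, False): 2,  # P-M, S-M
        (True, True): 3,    # M-P, M-S
        (False, True): 4,   # P-M, M-S
    }[position]
    mood = major.kind + minor.kind + conclusion.kind
    return mood, figure


def is_valid(p1: Proposition, p2: Proposition, conclusion: Proposition) -> bool:
    parsed = mood_and_figure(p1, p2, conclusion)
    if parsed is None:
        return False
    mood, figure = parsed
    return mood in VALID_FORMS[figure]
```

Because the check consults only proposition types and term positions, the plausibility of the content cannot affect the verdict. In this sketch, premise order is normalized symbolically; whether the described system does the same is not stated in the abstract.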