The Dog the Cat Chased Stumped the Model: Measuring When Language Models Abandon Structure for Shortcuts

Sangmitra Madhusudan, Kaige Chen, Ali Emami


Abstract
When language models correctly parse "The cat that the dog chased meowed," are they analyzing syntax or simply familiar with dogs chasing cats? Despite extensive benchmarking, we lack methods to distinguish structural understanding from semantic pattern matching. We introduce CenterBench, a dataset of 9,720 comprehension questions on center-embedded sentences (like "The cat [that the dog chased] meowed") in which relative clauses nest recursively, creating processing demands that range from simple to deeply nested structures. Each sentence has a syntactically identical but semantically implausible counterpart (e.g., mailmen prescribe medicine, doctors deliver mail) and six comprehension questions testing surface understanding, syntactic dependencies, and causal reasoning. Testing six models reveals that performance gaps between plausible and implausible sentences widen systematically with complexity, with models showing median gaps of up to 26.8 percentage points, quantifying when they abandon structural analysis for semantic associations. Notably, semantic plausibility harms performance on questions about resulting actions, where following causal relationships matters more than semantic coherence. Reasoning models improve accuracy, but their traces show semantic shortcuts, overthinking, and answer refusal. Unlike models, whose plausibility advantage widens systematically with complexity, humans show variable semantic effects. CenterBench provides the first framework to identify when models shift from structural analysis to pattern matching.
Anthology ID:
2026.eacl-long.19
Volume:
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
March
Year:
2026
Address:
Rabat, Morocco
Editors:
Vera Demberg, Kentaro Inui, Lluís Màrquez
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
428–453
URL:
https://preview.aclanthology.org/ingest-eacl/2026.eacl-long.19/
Cite (ACL):
Sangmitra Madhusudan, Kaige Chen, and Ali Emami. 2026. The Dog the Cat Chased Stumped the Model: Measuring When Language Models Abandon Structure for Shortcuts. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 428–453, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
The Dog the Cat Chased Stumped the Model: Measuring When Language Models Abandon Structure for Shortcuts (Madhusudan et al., EACL 2026)
PDF:
https://preview.aclanthology.org/ingest-eacl/2026.eacl-long.19.pdf