The Text Aphasia Battery (TAB): A Clinically-Grounded Benchmark for Aphasia-Like Deficits in Language Models

Nathan Roll, Jill Kries, Flora Jin, Catherine Wang, Ann Marie Finley, Meghan Sumner, Cory Shain, Laura Gwilliams


Abstract
Large language models (LLMs) have emerged as a candidate ‘model organism’ for human language, offering an unprecedented opportunity to study the computational basis of linguistic disorders like aphasia. However, traditional clinical assessments are ill-suited for LLMs, as they presuppose human-like pragmatic pressures and probe cognitive processes not inherent to artificial architectures. We introduce the Text Aphasia Battery (TAB), a text-only benchmark adapted from the Quick Aphasia Battery (QAB) to assess aphasia-like deficits in LLMs. The TAB comprises four subtests: Connected Text, Word Comprehension, Sentence Comprehension, and Repetition. This paper details the TAB’s design, subtests, and scoring criteria. To facilitate large-scale use, we validate an automated evaluation protocol using Gemini 2.5 Flash, which achieves reliability comparable to that of expert human raters (prevalence-weighted Cohen’s κ = 0.255 for model–consensus agreement vs. 0.286 for human–human agreement). We release TAB as a clinically-grounded, scalable framework for analyzing language deficits in artificial systems.
Anthology ID:
2026.clpsych-1.27
Volume:
Proceedings of the 10th Workshop on Computational Linguistics and Clinical Psychology (CLPsych 2026)
Month:
July
Year:
2026
Address:
San Diego, California, USA
Editors:
Aya Zirikly, Kfir Bar, Sean MacAvaney, Molly Ireland, Yaakov Ophir, Dana Atzil-Slonim, Vasudha Varadarajan, Steven Bedrick, Bart Desmet
Venues:
CLPsych | WS
Publisher:
Association for Computational Linguistics
Pages:
340–354
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.clpsych-1.27/
Cite (ACL):
Nathan Roll, Jill Kries, Flora Jin, Catherine Wang, Ann Marie Finley, Meghan Sumner, Cory Shain, and Laura Gwilliams. 2026. The Text Aphasia Battery (TAB): A Clinically-Grounded Benchmark for Aphasia-Like Deficits in Language Models. In Proceedings of the 10th Workshop on Computational Linguistics and Clinical Psychology (CLPsych 2026), pages 340–354, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal):
The Text Aphasia Battery (TAB): A Clinically-Grounded Benchmark for Aphasia-Like Deficits in Language Models (Roll et al., CLPsych 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.clpsych-1.27.pdf