Assessing the Agreement Competence of Large Language Models

Alba Táboas García, Leo Wanner


Abstract
While the competence of LLMs to cope with agreement constraints has been widely tested in English, only a very limited number of works deal with morphologically rich(er) languages. In this work, we experiment with 25 mono- and multilingual LLMs, applying them to a collection of more than 5,000 test examples that cover the main agreement phenomena in three Romance languages (Italian, Portuguese, and Spanish) and one Slavic language (Russian). We identify which of the agreement phenomena are most difficult for which models and challenge some common assumptions about what makes a good model. The test suites into which the test examples are organized are openly available and can be easily adapted to other agreement phenomena and other languages for further research.
Anthology ID:
2025.depling-1.4
Volume:
Proceedings of the Eighth International Conference on Dependency Linguistics (Depling, SyntaxFest 2025)
Month:
August
Year:
2025
Address:
Ljubljana, Slovenia
Editors:
Eva Hajičová, Sylvain Kahane
Venues:
DepLing | WS | SyntaxFest
SIG:
SIGPARSE
Publisher:
Association for Computational Linguistics
Pages:
36–53
URL:
https://preview.aclanthology.org/transition-to-people-yaml/2025.depling-1.4/
Cite (ACL):
Alba Táboas García and Leo Wanner. 2025. Assessing the Agreement Competence of Large Language Models. In Proceedings of the Eighth International Conference on Dependency Linguistics (Depling, SyntaxFest 2025), pages 36–53, Ljubljana, Slovenia. Association for Computational Linguistics.
Cite (Informal):
Assessing the Agreement Competence of Large Language Models (Táboas García & Wanner, DepLing-SyntaxFest 2025)
PDF:
https://preview.aclanthology.org/transition-to-people-yaml/2025.depling-1.4.pdf