@inproceedings{de-vergnette-amblard-2026-semantic,
  title         = {Semantic Parsing for Evaluating Large Language Models: Separating Linguistic Abilities with {YARN}},
  author        = {de Vergnette, R{\'e}mi and
                   Amblard, Maxime},
  editor        = {Piperidis, Stelios and
                   Bel, N{\'u}ria and
                   van den Heuvel, Henk and
                   Ide, Nancy and
                   Krek, Simon and
                   Toral, Antonio},
  booktitle     = {Proceedings of the International Conference on Language Resources and Evaluation ({LREC})},
  month         = may,
  year          = {2026},
  address       = {Palma de Mallorca, Spain},
  publisher     = {ELRA Language Resource Association},
  url           = {https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.765/},
  pages         = {9745--9755},
  internal-note = {Anthology track: main (was mis-entered as volume = "main"); entry type corrected from @article to @inproceedings, conference name moved from journal to booktitle},
  abstract      = {We evaluate large language models (LLMs) through semantic parsing into Yarn, a structured meaning representation that distinguishes predicate{--}argument structure from higher-level linguistic features such as tense, aspect, and modality. For evaluation, we employ SmatchY, a fine-grained metric designed to assess different layers of meaning independently. Our experiments test multiple LLMs under varied conditions, including inference modes, linearization formats (JSON and logic-inspired CFG), and the presence or absence of auxiliary supervision via partial semantic parses. Results show that model performance is highly sensitive to both representational design and supervision, with no single configuration consistently outperforming the others. While some models gain from additional semantic information in prompts, others are negatively affected. A layer-wise analysis indicates that surface-level features such as temporality and negation are captured more reliably than deeper semantic phenomena like quantification. Consistent with prior work, our findings highlight the limited capacity of current LLMs to generate fully formal meaning representations.},
}

@comment{Scraped citation blurb from the Anthology page, kept for reference:
Markdown (Informal)
[Semantic Parsing for Evaluating Large Language Models: Separating Linguistic Abilities with YARN](https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.765/) (de Vergnette & Amblard, LREC 2026)
ACL}