Adele Henot-Mortier


2023

Do Language Models discriminate between relatives and pseudorelatives?
Adele Henot-Mortier
Proceedings of the 2023 CLASP Conference on Learning with Small Data (LSD)

Large Language Models (LLMs) are often evaluated against massive benchmarks based on general-purpose tasks, which, despite being useful for concrete applications, tell us very little about the capacity of LLMs to learn specific and challenging aspects of grammar. Here, we evaluate whether LLMs learn to identify a particular structure attested in Romance (and French in particular), called the pseudorelative. This structure, which is often surface-similar to a relative clause, is linked to robust syntactic and semantic restrictions. We present a series of experiments to test whether LLMs pretrained on massive yet general corpora manage to learn those various restrictions. Our results suggest that LLMs learn some but not all of these properties, and crucially fail at recognizing the most specific of them: cliticization.