Claire E Stevenson

Also published as: Claire E. Stevenson


2026

In people, the ability to solve analogies such as “body : feet :: table : ?” emerges in childhood, and appears to transfer easily to other domains, such as the visual domain “( : ) :: < : ?”. Recent research shows that large language models (LLMs) can solve various forms of analogies. However, can LLMs generalize analogy solving to other domains as people can? To investigate this, we had children, adults, and LLMs solve a series of letter-string analogies (e.g., a b : a c :: j k : ?) in the Latin alphabet, in a near-transfer domain (Greek alphabet), and in a far-transfer domain (list of symbols). Children and adults easily generalized their knowledge to unfamiliar domains, whereas LLMs did not. This key difference between human and AI performance is evidence that these LLMs still struggle with robust human-like analogical transfer.
Analogical reasoning is an essential aspect of human cognition. In this paper, we summarize key theories about the processes underlying analogical reasoning from the cognitive science literature and relate them to current research in natural language processing. While these processes can be easily linked to concepts in NLP, they are generally not viewed through a cognitive lens. Furthermore, we show how these notions are relevant for several major challenges in NLP research that are not directly related to analogy solving. This may guide researchers to better optimize relational understanding in text, as opposed to relying heavily on entity-level similarity.

2025

Analogy-making lies at the heart of human cognition. Adults solve analogies such as horse belongs to stable like chicken belongs to …? by mapping relations (kept in) and answering chicken coop. In contrast, young children often use association, e.g., answering egg. This paper investigates whether large language models (LLMs) solve verbal analogies in A:B::C:? form using associations, similar to what children do. We use verbal analogies extracted from an online learning environment, where 14,006 7- to 12-year-olds from the Netherlands solved 872 analogies in Dutch. The eight tested LLMs performed at or above the level of children, with some models approaching adult performance estimates. However, when we control for solving by association, this picture changes. We conclude that the LLMs we tested rely heavily on association, as young children do. However, LLMs make different errors than children, and association doesn’t fully explain their superior performance on this children’s verbal analogy task. Future work will investigate whether LLMs’ associations and errors are more similar to adult relational reasoning.