Localizing Events in Space: Comparing Humans and AI Models

Derrick Eui Gyu Kim, Kenneth Lai, James Pustejovsky


Abstract
Understanding how Large Language Models (LLMs) and Text-to-Image models (T2Is) acquire and apply implicit spatial knowledge remains an open challenge. In this paper, we present a novel dataset and evaluation framework designed to probe event localization capabilities in humans, LLMs, and T2Is. Our dataset includes 134 sentence pairs derived from Flickr30k captions, where explicit location information is systematically removed via Abstract Meaning Representation (AMR) parsing and manual refinement. Using this dataset, we analyze the effects of location ablation on spatial reasoning across human annotators, LLMs, and T2Is. Results show that while humans maintain robust location inferences after ablation, LLMs exhibit degraded performance, particularly for semantically polysemous verbs. T2Is demonstrate similar limitations, often generating visually inconsistent spatial contexts when locative cues are missing. Our findings highlight the gap between humans and both LLMs and T2Is in recovering implicit situational knowledge and suggest future directions for improving spatial reasoning in multimodal AI systems. This dataset contribution serves as a proof of concept for systematic evaluation of implicit spatial reasoning and paves the way for larger-scale studies.
Anthology ID:
2026.lrec-main.865
Volume:
Proceedings of the Fifteenth Language Resources and Evaluation Conference
Month:
May
Year:
2026
Address:
Palma de Mallorca, Spain
Editors:
Stelios Piperidis, Núria Bel, Henk van den Heuvel, Nancy Ide, Simon Krek, Antonio Toral
Venue:
LREC
Publisher:
ELRA Language Resources Association
Pages:
11072–11084
URL:
https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.865/
Cite (ACL):
Derrick Eui Gyu Kim, Kenneth Lai, and James Pustejovsky. 2026. Localizing Events in Space: Comparing Humans and AI Models. International Conference on Language Resources and Evaluation, main:11072–11084.
Cite (Informal):
Localizing Events in Space: Comparing Humans and AI Models (Kim et al., LREC 2026)
PDF:
https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.865.pdf