SceneGram: Conceptualizing and Describing Tangrams in Scene Context

Simeon Junker, Sina Zarrieß


Abstract
Research on reference and naming suggests that humans can come up with very different ways of conceptualizing and referring to the same object, e.g. the same abstract tangram shape can be a “crab”, “sink” or “space ship”. Another common assumption in cognitive science is that scene context fundamentally shapes our visual perception of objects and conceptual expectations. This paper contributes SceneGram, a dataset of human references to tangram shapes placed in different scene contexts, allowing for systematic analyses of the effect of scene context on conceptualization. Based on this data, we analyze references to tangram shapes generated by multimodal LLMs, showing that these models do not account for the richness and variability of conceptualizations found in human references.
Anthology ID:
2025.findings-acl.1229
Volume:
Findings of the Association for Computational Linguistics: ACL 2025
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venues:
Findings | WS
Publisher:
Association for Computational Linguistics
Pages:
23976–23992
URL:
https://preview.aclanthology.org/acl25-workshop-ingestion/2025.findings-acl.1229/
Cite (ACL):
Simeon Junker and Sina Zarrieß. 2025. SceneGram: Conceptualizing and Describing Tangrams in Scene Context. In Findings of the Association for Computational Linguistics: ACL 2025, pages 23976–23992, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
SceneGram: Conceptualizing and Describing Tangrams in Scene Context (Junker & Zarrieß, Findings 2025)
PDF:
https://preview.aclanthology.org/acl25-workshop-ingestion/2025.findings-acl.1229.pdf