Coreference as an indicator of context scope in multimodal narrative
Nikolai Ilinykh, Shalom Lappin, Asad B. Sayeed, Sharid Loáiciga
Abstract
We demonstrate that large multimodal language models differ substantially from humans in the distribution of coreferential expressions in a visual storytelling task. We introduce a number of metrics to quantify the characteristics of coreferential patterns in both human- and machine-written texts. Humans distribute coreferential expressions in a way that maintains consistency across texts and images, interleaving references to different entities in a highly varied way. Machines are less able to track mixed references, despite achieving perceived improvements in generation quality. Materials, metrics, and code for our study are available at https://github.com/GU-CLASP/coreference-context-scope.
- Anthology ID:
- 2025.gem-1.67
- Volume:
- Proceedings of the Fourth Workshop on Generation, Evaluation and Metrics (GEM²)
- Month:
- July
- Year:
- 2025
- Address:
- Vienna, Austria and virtual meeting
- Editors:
- Kaustubh Dhole, Miruna Clinciu
- Venues:
- GEM | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 789–807
- URL:
- https://preview.aclanthology.org/transition-to-people-yaml/2025.gem-1.67/
- Cite (ACL):
- Nikolai Ilinykh, Shalom Lappin, Asad B. Sayeed, and Sharid Loáiciga. 2025. Coreference as an indicator of context scope in multimodal narrative. In Proceedings of the Fourth Workshop on Generation, Evaluation and Metrics (GEM²), pages 789–807, Vienna, Austria and virtual meeting. Association for Computational Linguistics.
- Cite (Informal):
- Coreference as an indicator of context scope in multimodal narrative (Ilinykh et al., GEM 2025)
- PDF:
- https://preview.aclanthology.org/transition-to-people-yaml/2025.gem-1.67.pdf