Lending Eyesight to Language Models: Modeling and Probing Human scanpath through Transformer Decoder

Junlin Li; David Robert Reich; Yu-Yin Hsu

Abstract

Human scanpaths offer rich and reliable clues about the cognitive mechanisms underlying language comprehension. Decoder-only language models, typically large language models (LLMs), have been shown to exhibit striking parallels with human cognitive processes. In this study, we investigate to what extent language models can be endowed with human-like gaze shifts. In addition, by probing scanpaths through an eye model, analogous to probing language through language models, we ask whether such modeling can yield novel knowledge about the cognitive machinery of sense-making. This study presents a novel plug-and-play module, EyeLM, that transforms an autoregressive language model into an autoregressive eye model, thus facilitating probabilistic spatial modeling of human explicit attention. Our EyeLM module, powered by LLMs, achieves competitive performance while offering novel cognitive probing capabilities. By probing EyeLM, we can estimate the predictability and uncertainty of a scanpath. These probabilistic measures of the scanpath exhibit patterns aligned with prior knowledge about human reading comprehension and act as promising predictors of human comprehension skills.

Anthology ID: 2026.findings-acl.1591
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 31807–31819
URL: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1591/
Cite (ACL): Junlin Li, David Robert Reich, and Yu-Yin Hsu. 2026. Lending Eyesight to Language Models: Modeling and Probing Human scanpath through Transformer Decoder. In Findings of the Association for Computational Linguistics: ACL 2026, pages 31807–31819, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Lending Eyesight to Language Models: Modeling and Probing Human scanpath through Transformer Decoder (Li et al., Findings 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1591.pdf
Checklist: 2026.findings-acl.1591.checklist.pdf
