From Human Reading to NLM Understanding: Evaluating the Role of Eye-Tracking Data in Encoder-Based Models

Luca Dini, Lucia Domenichelli, Dominique Brunato, Felice Dell’Orletta


Abstract
Cognitive signals, particularly eye-tracking data, offer valuable insights into human language processing. Leveraging eye-gaze data from the Ghent Eye-Tracking Corpus, we conducted a series of experiments to examine how integrating knowledge of human reading behavior impacts Neural Language Models (NLMs) across multiple dimensions: task performance, attention mechanisms, and the geometry of their embedding space. We explored several fine-tuning methodologies to inject eye-tracking features into the models. Our results reveal that incorporating these features does not degrade downstream task performance, enhances alignment between model attention and human attention patterns, and compresses the geometry of the embedding space.
Anthology ID:
2025.acl-long.870
Volume:
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
17796–17813
URL:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.870/
Cite (ACL):
Luca Dini, Lucia Domenichelli, Dominique Brunato, and Felice Dell’Orletta. 2025. From Human Reading to NLM Understanding: Evaluating the Role of Eye-Tracking Data in Encoder-Based Models. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 17796–17813, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
From Human Reading to NLM Understanding: Evaluating the Role of Eye-Tracking Data in Encoder-Based Models (Dini et al., ACL 2025)
PDF:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.870.pdf