@inproceedings{stebakov-pershin-2025-well,
    title = "How Well Can {AI} Models Generate Human Eye Movements During Reading?",
    author = "Stebakov, Ivan  and
      Pershin, Ilya",
    editor = "Blodgett, Su Lin  and
      Curry, Amanda Cercas  and
      Dev, Sunipa  and
      Li, Siyan  and
      Madaio, Michael  and
      Wang, Jack  and
      Wu, Sherry Tongshuang  and
      Xiao, Ziang  and
      Yang, Diyi",
    booktitle = "Proceedings of the Fourth Workshop on Bridging Human-Computer Interaction and Natural Language Processing (HCI+NLP)",
    month = nov,
    year = "2025",
    address = "Suzhou, China",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2025.hcinlp-1.12/",
    pages = "148--162",
    ISBN = "979-8-89176-353-1",
    abstract = "Eye movement analysis has become an essential tool for studying cognitive processes in reading, serving both psycholinguistic research and natural language processing applications aimed at enhancing language model performance. However, the scarcity of eye-tracking data and its limited generalizability constrain data-driven approaches. Synthetic scanpath generation offers a potential solution to these limitations. While recent advances in scanpath generation show promise, current literature lacks systematic evaluation frameworks that comprehensively assess models' ability to reproduce natural reading gaze patterns. Existing studies often focus on isolated metrics rather than holistic evaluation of cognitive plausibility. This study presents a systematic evaluation of contemporary scanpath generation models, assessing their capacity to replicate natural reading behavior through comprehensive scanpath analysis. We demonstrate that while synthetic scanpath models successfully reproduce basic gaze patterns, significant limitations persist in capturing part-of-speech dependent gaze features and reading behaviors. Our cross-dataset comparison reveals performance degradation in three key areas: generalization across text genres, processing of long sentences, and reproduction of psycholinguistic effects. These findings underscore the need for more robust evaluation protocols and model architectures that better account for psycholinguistic complexity. Through detailed analysis of fixation sequences, durations, and reading patterns, we identify concrete pathways for developing more cognitively plausible scanpath generation models."
}