Surprisal Estimators for Human Reading Times Need Character Models

Byung-Doh Oh, Christian Clark, William Schuler


Abstract
While the use of character models has been popular in NLP applications, it has not been explored much in the context of psycholinguistic modeling. This paper presents a character model that can be applied to a structural parser-based processing model to calculate word generation probabilities. Experimental results show that surprisal estimates from a structural processing model using this character model deliver substantially better fits to self-paced reading, eye-tracking, and fMRI data than those from large-scale language models trained on much more data. This may suggest that the proposed processing model provides a more humanlike account of sentence processing, which assumes a larger role of morphology, phonotactics, and orthographic complexity than was previously thought.
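Surprisal is the negative log probability of a word in context; the paper decomposes this word generation probability over characters within a structural parser-based processing model. The sketch below is only a minimal illustration of that character-level decomposition, not the authors' model: it trains a toy smoothed character bigram model and sums per-character surprisals. The corpus, the function names train_char_bigrams and char_surprisal, and the smoothing constants are all hypothetical.

```python
import math
from collections import Counter, defaultdict

# A word's surprisal decomposes over its characters:
#   surprisal(w) = -log2 P(w) = -sum_i log2 P(c_i | c_{i-1})
# under a character bigram model. This toy model does NOT condition on
# parser states as the paper's structural processing model does; it only
# illustrates how a character model yields word generation probabilities.

BOS, EOS = "<w>", "</w>"  # word-boundary symbols

def train_char_bigrams(corpus):
    """Count character bigrams, including word-boundary transitions."""
    counts = defaultdict(Counter)
    for word in corpus:
        chars = [BOS] + list(word) + [EOS]
        for prev, cur in zip(chars, chars[1:]):
            counts[prev][cur] += 1
    return counts

def char_surprisal(word, counts, alpha=1.0, vocab_size=28):
    """Word surprisal in bits under an add-alpha-smoothed bigram model.

    vocab_size is a rough stand-in for the character inventory size.
    """
    chars = [BOS] + list(word) + [EOS]
    bits = 0.0
    for prev, cur in zip(chars, chars[1:]):
        denom = sum(counts[prev].values()) + alpha * vocab_size
        p = (counts[prev][cur] + alpha) / denom
        bits -= math.log2(p)
    return bits

corpus = ["the", "they", "them", "then", "there", "toe", "tie"]
counts = train_char_bigrams(corpus)
for w in ["the", "thx"]:
    print(f"surprisal({w!r}) = {char_surprisal(w, counts):.2f} bits")
```

On this toy model, the attested word "the" receives lower surprisal than the orthographically odd "thx", illustrating how character-level probabilities make surprisal sensitive to phonotactic and orthographic regularities, the kind of sensitivity the abstract attributes to the proposed estimator.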
Anthology ID:
2021.acl-long.290
Volume:
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Month:
August
Year:
2021
Address:
Online
Editors:
Chengqing Zong, Fei Xia, Wenjie Li, Roberto Navigli
Venues:
ACL | IJCNLP
Publisher:
Association for Computational Linguistics
Pages:
3746–3757
URL:
https://aclanthology.org/2021.acl-long.290
DOI:
10.18653/v1/2021.acl-long.290
Bibkey:
Cite (ACL):
Byung-Doh Oh, Christian Clark, and William Schuler. 2021. Surprisal Estimators for Human Reading Times Need Character Models. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 3746–3757, Online. Association for Computational Linguistics.
Cite (Informal):
Surprisal Estimators for Human Reading Times Need Character Models (Oh et al., ACL-IJCNLP 2021)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2021.acl-long.290.pdf
Video:
https://preview.aclanthology.org/emnlp-22-attachments/2021.acl-long.290.mp4
Code:
byungdoh/acl21_semproc
Data:
Natural Stories | Penn Treebank