Individual corpora predict fast memory retrieval during reading

Markus J. Hofmann, Lara Müller, Andre Rölke, Ralph Radach, Chris Biemann


Abstract
The corpus on which a predictive language model is trained can be considered the experience of a semantic system. We recorded the everyday reading of two participants on a tablet for two months, generating individual corpus samples of 300/500K tokens. We then trained word2vec models on the individual corpora and on a 70 million-sentence newspaper corpus to obtain individual and norm-based long-term memory structures. To test whether individual corpora make better predictions for a cognitive task of long-term memory retrieval, we generated stimulus materials consisting of 134 sentences with uncorrelated individual and norm-based word probabilities. In the subsequent eye-tracking study 1-2 months later, our regression analyses revealed that individual, but not norm-corpus-based, word probabilities can account for first-fixation duration and first-pass gaze duration. Word length additionally affected gaze duration and total viewing duration. The results suggest that corpora representative of an individual’s long-term memory structure can better explain reading performance than a norm corpus, and that recently acquired information is lexically accessed rapidly.
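The regression logic described in the abstract can be illustrated with a minimal sketch. It is not the authors' analysis: the data are made-up toy values, the function names `solve` and `ols` are hypothetical, and plain ordinary least squares stands in for whatever regression model the paper actually used. The sketch regresses a fixation-duration measure on two predictors, an individual-corpus and a norm-corpus word probability, and shows how a duration that depends only on the individual predictor yields a near-zero coefficient for the norm predictor.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Least-squares coefficients for y ~ intercept + predictors (normal equations)."""
    x = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical toy data: (individual, norm) word probabilities for six target words.
predictors = [(0.1, 0.5), (0.4, 0.2), (0.7, 0.9), (0.2, 0.8), (0.9, 0.3), (0.5, 0.6)]

# Made-up first-fixation durations in ms that depend only on the individual probability.
durations = [250.0 - 40.0 * individual for individual, _ in predictors]

intercept, b_individual, b_norm = ols(predictors, durations)
print(intercept, b_individual, b_norm)
```

With real eye-tracking data the durations would be noisy, so the coefficients would need standard errors and significance tests; a statistics package would be the natural choice there, and this sketch only shows the structure of the comparison.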
Anthology ID: 2020.cogalex-1.1
Volume: Proceedings of the Workshop on the Cognitive Aspects of the Lexicon
Month: December
Year: 2020
Address: Online
Venue: CogALex
Publisher: Association for Computational Linguistics
Pages: 1–11
URL: https://aclanthology.org/2020.cogalex-1.1
Cite (ACL): Markus J. Hofmann, Lara Müller, Andre Rölke, Ralph Radach, and Chris Biemann. 2020. Individual corpora predict fast memory retrieval during reading. In Proceedings of the Workshop on the Cognitive Aspects of the Lexicon, pages 1–11, Online. Association for Computational Linguistics.
Cite (Informal): Individual corpora predict fast memory retrieval during reading (Hofmann et al., CogALex 2020)
PDF: https://preview.aclanthology.org/ingestion-script-update/2020.cogalex-1.1.pdf