Richard L. Lewis

Also published as: Richard Lewis


Accounting for Agreement Phenomena in Sentence Comprehension with Transformer Language Models: Effects of Similarity-based Interference on Surprisal and Attention
Soo Hyun Ryu | Richard Lewis
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics

We advance a novel explanation of similarity-based interference effects in subject-verb and reflexive pronoun agreement processing, grounded in surprisal values computed from a pretrained large-scale Transformer model, GPT-2. Specifically, we show that surprisal of the verb or reflexive pronoun predicts facilitatory interference effects in ungrammatical sentences, where a distractor noun that matches in number with the verb or pronoun leads to faster reading times, despite the distractor not participating in the agreement relation. We review the human empirical evidence for such effects, including recent meta-analyses and large-scale studies. We also show that attention patterns in the Transformer (indexed by entropy and other measures) become diffuse in the presence of similar distractors, consistent with cue-based retrieval models of parsing. But in contrast to these models, the attentional cues and memory representations are learned entirely from the simple self-supervised task of predicting the next word.
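The two quantities this abstract links to reading times, surprisal and attention entropy, have standard information-theoretic definitions. A minimal toy sketch (not the paper's GPT-2 pipeline; the probabilities here are made up for illustration):

```python
import math

def surprisal(p):
    """Surprisal in bits of an event with probability p: -log2 p."""
    return -math.log2(p)

def entropy(dist):
    """Shannon entropy in bits of a probability distribution (a list of probabilities)."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

# A more expected continuation has lower surprisal (shorter predicted reading time).
print(surprisal(0.5))    # 1.0 bit
print(surprisal(0.125))  # 3.0 bits

# Attention spread diffusely over similar distractors has higher entropy
# than attention focused on a single target.
print(entropy([0.25, 0.25, 0.25, 0.25]))   # 2.0 bits (diffuse)
print(entropy([0.97, 0.01, 0.01, 0.01]))   # ~0.24 bits (focused)
```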


A Modeling Study of the Effects of Surprisal and Entropy in Perceptual Decision Making of an Adaptive Agent
Pyeong Whan Cho | Richard Lewis
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics

Processing difficulty in online language comprehension has been explained in terms of surprisal and entropy reduction. Although both hypotheses have been supported by experimental data, we do not fully understand their relative contributions to processing difficulty. To develop a better understanding, we propose a mechanistic model of perceptual decision making that interacts with a simulated task environment with temporal dynamics. The proposed model collects noisy bottom-up evidence over multiple timesteps, integrates it with its top-down expectation, and makes perceptual decisions, producing processing time data directly without relying on any linking hypothesis. Temporal dynamics in the task environment were determined by a simple finite-state grammar, which was designed to create situations where the surprisal and entropy reduction hypotheses predict different patterns. After the model was trained to maximize rewards, it developed an adaptive policy, and both surprisal and entropy effects were observed, especially in a measure reflecting earlier processing.
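The two hypotheses the abstract contrasts can dissociate on even a tiny finite-state grammar. The sketch below is purely illustrative (the grammar and its probabilities are invented, not the paper's): a low-probability word carries high surprisal yet may reduce uncertainty less than a high-probability word does.

```python
import math

def entropy(dist):
    """Shannon entropy in bits of a dict mapping words to probabilities."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# Hypothetical finite-state grammar: next-word distributions per context.
next_word = {"the": {"dog": 0.9, "cat": 0.1},
             "dog": {"barked": 1.0},
             "cat": {"meowed": 0.5, "purred": 0.5}}

def surprisal(context, word):
    """-log2 P(word | context)."""
    return -math.log2(next_word[context][word])

def entropy_reduction(context, word):
    """Drop in uncertainty about the upcoming word after observing `word`
    (clipped at zero, as in standard entropy-reduction linking hypotheses)."""
    h_before = entropy(next_word[context])
    h_after = entropy(next_word.get(word, {}))
    return max(h_before - h_after, 0.0)

# "cat" is surprising after "the", yet it reduces uncertainty less than "dog":
print(surprisal("the", "cat"))          # ~3.32 bits (high surprisal)
print(entropy_reduction("the", "dog"))  # ~0.47 bits
print(entropy_reduction("the", "cat"))  # 0.0 bits
```

Situations like this, where the two measures make opposite difficulty predictions for the same word, are what the simulated task environment was designed to create.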


Dynamic encoding of structural uncertainty in gradient symbols
Pyeong Whan Cho | Matthew Goldrick | Richard L. Lewis | Paul Smolensky
Proceedings of the 8th Workshop on Cognitive Modeling and Computational Linguistics (CMCL 2018)


Computationally Rational Saccadic Control: An Explanation of Spillover Effects Based on Sampling from Noisy Perception and Memory
Michael Shvartsman | Richard Lewis | Satinder Singh
Proceedings of the Fifth Workshop on Cognitive Modeling and Computational Linguistics


Modeling Sentence Processing in ACT-R
Shravan Vasishth | Richard L. Lewis
Proceedings of the Workshop on Incremental Parsing: Bringing Engineering and Cognition Together