Li Kloostra
2026
Lexical and Discourse Semantics in a Reading-time Corpus of English
Jakub Dotlacil | Laia Colina Fortuny | Li Kloostra | Johan Bos
Proceedings of the Fifteenth Language Resources and Evaluation Conference
We present a novel language resource that combines a reading-time corpus, constructed in psycholinguistics, with rich lexical, compositional, and discourse meaning representation annotations. While existing psycholinguistic corpora typically provide morphological and syntactic annotations, no comparable corpora with comprehensive semantic information have been made available until now. We enriched the UCL corpus (361 sentences of self-paced reading, eye-tracking, and EEG data) with annotations in the style of the Parallel Meaning Bank (PMB) project, including WordNet synsets, VerbNet thematic roles, Combinatory Categorial Grammar (CCG) parses, and Discourse Representation Theory (DRT) structures. We demonstrate the utility of this resource through two case studies examining (1) encoding interference effects due to gender similarity and (2) integration costs in semantic role assignment. Both studies reveal processing patterns consistent with established psycholinguistic theories and/or previous findings. This resource fills a significant gap in psycholinguistic research, enabling the evaluation of semantic processing theories on naturalistic corpus data and extending the existing pool of annotated reading-time corpora. It should be useful to psycholinguists, as well as to cognitive scientists interested in language processing.
2024
Using a Language Model to Unravel Semantic Development in Children’s Use of a Dutch Perception Verb
Bram van Dijk | Max J. van Duijn | Li Kloostra | Marco Spruit | Barend Beekhuizen
Proceedings of the Workshop on Cognitive Aspects of the Lexicon @ LREC-COLING 2024
In this short paper, we employ a Language Model (LM) to gain insight into how the complex semantics of a Perception Verb (PV) emerge in children. Using a Dutch LM as a representation of mature language use, we find that, for all ages, 1) the LM accurately predicts PV use in children’s freely told narratives; 2) children’s PV use is close to mature use; and 3) complex PV meanings with attentional and cognitive aspects can be found. Our approach illustrates how LMs can be meaningfully employed in studying language development, and hence takes a constructive position in the debate on the relevance of LMs in this context.