T. Florian Jaeger

Also published as: Florian Jaeger

2026

Adaptive Speech Perception: Empirical Indeterminacy and a Path Forward
Shawn N. Cummings | T. Florian Jaeger | Chigusa Kurumada | Xin Xie
Proceedings of the Society for Computation in Linguistics 2026

Human listeners rapidly adapt to unfamiliar talkers, but the underlying computational mechanisms remain contested. Three candidate hypotheses—pre-linguistic normalization, changes in phonetic category representations, and changing decision biases—have largely been pursued in separation, using subfield-specific paradigms. Researchers working in these paradigms often assume that adaptivity observed in their particular paradigm can only be explained by one of the three mechanisms. We test this assumption for one of the most popular experimental paradigms (lexically-guided perceptual learning or LGPL) using a unified computational framework (ASP). We apply ASP to the largest existing LGPL data: 89,600 categorization responses from over 1000 listeners after lexically-guided exposure to 32 different stimulus sets. Despite the unprecedented scale of these data, we find that behavioral data are equally compatible with all three candidate mechanisms. We discuss how model-guided stimulus selection can increase the diagnosticity of future LGPL experiments. Our simulation code can easily be adapted to other experimental paradigms.

2019

Modeling Long-Distance Cue Integration in Spoken Word Recognition
Wednesday Bushong | T. Florian Jaeger
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics

Cues to linguistic categories are distributed across the speech signal. Optimal categorization thus requires that listeners maintain gradient representations of incoming input in order to integrate that information with later cues. There is now evidence that listeners can and do integrate cues that occur far apart in time. Computational models of this integration have, however, been lacking. We take a first step toward addressing this gap by mathematically formalizing four models of how listeners may maintain and use cue information during spoken language understanding, and we test them on two perception experiments. In one experiment, we find support for rational integration of cues at long distances. In a second, more memory- and attention-taxing experiment, we find evidence in favor of a switching model that avoids maintaining detailed representations of cues in memory. These results are a first step in understanding what kinds of mechanisms listeners use for cue integration under different memory and attentional constraints.

2017

Grounding sound change in ideal observer models of perception
Zachary Burchill | T. Florian Jaeger
Proceedings of the 7th Workshop on Cognitive Modeling and Computational Linguistics (CMCL 2017)

An important predictor of historical sound change, functional load, fails to capture insights from speech perception. Building on ideal observer models of word recognition, we devise a new definition of functional load that incorporates both a priori predictability and perceptual information. We explore this new measure with a simple model and find that it outperforms traditional measures.

We describe a new task-based corpus in the Spanish language. The corpus consists of videos, transcripts, and annotations of the interaction between a naive speaker and a confederate listener. The speaker instructs the listener to MOVE, ROTATE, or PAINT objects on a computer screen. This resource can be used to study how participants produce instructions in a collaborative goal-oriented scenario, in Spanish. The data set is ideally suited for investigating incremental processes of the production and interpretation of language. We demonstrate here how to use this corpus to explore language-specific differences in utterance planning, for English and Spanish speakers.

2008

Production in a Multimodal Corpus: how Speakers Communicate Complex Actions
Carlos Gómez Gallo | T. Florian Jaeger | James Allen | Mary Swift
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

We describe a new multimodal corpus currently under development. The corpus consists of videos of task-oriented dialogues that are annotated for speakers' verbal requests and domain action executions. This resource provides data for new research on language production and comprehension. The corpus can be used to study speakers' decisions as to how to structure their utterances given the complexity of the message they are trying to convey.