Ambika Kirkland


2026

Speech disfluencies have been shown to affect both judgments of a speaker's competence and decisions about which source of information to rely on. However, fluency effects more broadly are highly sensitive to context: they are strongest when little other information is available to inform judgments and decisions, and they can be attenuated or even reversed by metacognitive processes. Speech is generally experienced in the context of interactions, where listeners have access to a wealth of information about the speaker and other parameters relevant to decision-making. It is therefore crucial to consider how the outcomes of studies on speech disfluencies might be shaped by the framing of experimental tasks and the information available to participants. We carried out a decision-making task in which participants had to choose which of two speakers, one fluent and one disfluent, had answered a trivia question correctly. The task was presented in the context of three scenarios that provided different information about the speakers. Previous findings that listeners prefer fluent answers were replicated in only one of these three contexts, demonstrating the importance of task framing.

2022

As part of the PSST challenge, we explore how data augmentation, data sources, and model size affect phoneme transcription accuracy for speech produced by individuals with aphasia. We evaluate model performance in terms of feature error rate (FER) and phoneme error rate (PER). We find that data augmentation techniques, such as pitch shifting, improve model performance. Additionally, increasing model size decreases both FER and PER. Our experiments also show that adding manually transcribed speech from non-aphasic speakers (TIMIT) improves performance when room impulse response augmentation is used. The best-performing model combines aphasic and non-aphasic data and achieves a 21.0% PER and a 9.2% FER, a relative improvement of 9.8% over the baseline model on the primary outcome measure. We show that data augmentation, larger model size, and additional non-aphasic data sources can help improve automatic phoneme recognition models for people with aphasia.
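To make two of the abstract's moving parts concrete, here is a minimal sketch in Python, not the authors' actual pipeline: a pitch-shift augmentation step using librosa, and PER computed as the Levenshtein edit distance between phoneme sequences, normalized by the reference length. The file path, sample rate, and phoneme strings are illustrative placeholders.

# Sketch of two ideas from the abstract (assumptions, not the paper's code):
# pitch-shift data augmentation and phoneme error rate (PER).

import librosa


def pitch_shift_augment(wav_path, n_steps=2.0):
    """Load a waveform and shift its pitch by n_steps semitones."""
    y, sr = librosa.load(wav_path, sr=16000)  # sample rate is an assumption
    return librosa.effects.pitch_shift(y, sr=sr, n_steps=n_steps), sr


def phoneme_error_rate(ref, hyp):
    """PER: edit distance between phoneme sequences / reference length."""
    # Standard dynamic-programming Levenshtein distance.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / len(ref)


# Example with made-up phoneme sequences: one substitution (AH -> AA)
# and one deletion (Z) against a 5-phoneme reference gives PER = 2/5.
ref = ["HH", "AH", "L", "OW", "Z"]
hyp = ["HH", "AA", "L", "OW"]
print(f"PER = {phoneme_error_rate(ref, hyp):.2f}")  # PER = 0.40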