Mario Corrales-Astorgano


2026

This paper explores the use of an annotated speech corpus to assess multiple dimensions of speech quality (particularly phonetics, fluency, and prosody) in individuals with Down syndrome, with the aim of informing the development of automated assessment tools. We conducted a series of experiments using the GOPT model together with representations extracted from Wav2Vec models fine-tuned for phoneme classification. Model predictions were compared against expert annotations from a speech-language pathologist using Pearson correlation. Results demonstrate significant improvements over prior work, with correlations of up to 0.49 in certain aspects, particularly for the phonetic and fluency dimensions, while prosody remained more challenging to model. The study highlights the potential of Transformer-based architectures for atypical speech assessment and underscores the challenges inherent in this task, particularly the variability linked to specific disfluency types.
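The evaluation step described in this abstract, comparing model scores with expert annotations via Pearson correlation, can be illustrated with a minimal sketch. The function name `pearson` and the sample scores are hypothetical and are not taken from the paper; the code only shows the standard formula.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical example: model predictions vs. expert ratings per utterance.
model_scores = [1.2, 0.8, 1.9, 1.5, 0.4]
expert_scores = [1.0, 1.0, 2.0, 1.0, 0.0]
r = pearson(model_scores, expert_scores)
```

A value of `r` near 1 would mean the model ranks utterances much as the expert does; the abstract reports values of up to 0.49.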

2016

This paper describes the recording of a speech corpus focused on the prosody of people with intellectual disabilities. A video game is used during recording to improve the users' motivation. The players' profiles and the sentences recorded during the game sessions are also described. To identify the main prosodic difficulties of people with intellectual disabilities, several prosodic features are extracted from the recordings, including fundamental frequency, energy, and pauses. The recordings of people with intellectual disabilities are then compared with those of people without intellectual disabilities. This comparison shows that pauses are the most discriminative feature between the two groups. To verify this, a study was carried out using machine learning techniques, achieving a classification rate above 80%.
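As a rough illustration of how pause features such as those mentioned in this abstract might be derived, the sketch below counts low-energy runs in a sequence of per-frame energy values. The function name `pause_features`, the threshold, the frame length, and the minimum pause length are all assumptions made for this example; the paper's actual extraction procedure is not described here.

```python
def pause_features(frame_energy, frame_s=0.01, threshold=0.01, min_pause_frames=20):
    """Summarize pauses in a list of per-frame energies.

    A pause is assumed to be a run of at least `min_pause_frames`
    consecutive frames whose energy is below `threshold`.
    """
    pause_lengths = []  # pause durations in frames
    run = 0
    # A sentinel with infinite energy closes any run that reaches the end.
    for energy in list(frame_energy) + [float("inf")]:
        if energy < threshold:
            run += 1
        else:
            if run >= min_pause_frames:
                pause_lengths.append(run)
            run = 0
    total_frames = len(frame_energy)
    pause_frames = sum(pause_lengths)
    return {
        "n_pauses": len(pause_lengths),
        "pause_time_s": pause_frames * frame_s,
        "pause_ratio": pause_frames / total_frames if total_frames else 0.0,
    }

# Hypothetical signal: speech, a long silence, speech, a short gap, speech.
energies = [1.0] * 10 + [0.0] * 30 + [1.0] * 10 + [0.0] * 5 + [1.0] * 5
features = pause_features(energies)
```

Features of this kind could then be passed to any standard classifier to separate the two speaker groups; the choice of classifier is left open here.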