Rhythmic Proximity Between Natives And Learners Of French - Evaluation of a metric based on the CEFC corpus

Sylvain Coulange, Solange Rossato


Abstract
This work aims to better understand the role of rhythm in foreign accent, and how to model it. We built a model of rhythm in French that takes its variability into account, using the Corpus pour l’Étude du Français Contemporain (CEFC), which contains up to 300 hours of speech covering a wide variety of speaker profiles and situations. Sixteen parameters were computed, all based on segment durations, such as voicing and intersyllabic timing. All parameters are extracted fully automatically from the signal, without ASR or transcription. A Gaussian mixture model was trained on 1,340 native speakers of French; any speech sample of at least 30 seconds can then be scored to obtain the probability that it belongs to this model. We tested the model with 146 held-out native speakers (NS), 37 non-native speakers (NNS) from the same corpus, and 29 Japanese learners of French (JpNNS) from an independent corpus. NNS tended to obtain lower log-likelihoods than NS, but the difference did not reach significance (p=.067), possibly because of the heterogeneous French proficiency of these speakers; the difference was clearly significant for JpNNS (p<.0001), all of whom were at A2 level. Eta-squared effect sizes showed that the most discriminating parameters were intersyllabic mean duration and variation coefficient, along with speech rate, for NNS; and speech rate and phonation ratio for JpNNS.
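As an illustration of two quantities named in the abstract, the sketch below computes a variation coefficient over a list of durations and an eta-squared effect size for one parameter across speaker groups. It is a minimal sketch, not the authors' implementation: the function names, the use of the population standard deviation, and all numeric values are hypothetical assumptions made for the example.

```python
from statistics import mean, pstdev


def variation_coefficient(durations):
    """Standard deviation divided by the mean.

    Assumption: the population standard deviation is used; the paper's
    exact definition is not given in the abstract.
    """
    return pstdev(durations) / mean(durations)


def eta_squared(groups):
    """Between-group sum of squares divided by the total sum of squares."""
    values = [v for group in groups for v in group]
    grand_mean = mean(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    ss_between = sum(
        len(group) * (mean(group) - grand_mean) ** 2 for group in groups
    )
    return ss_between / ss_total


# Hypothetical intersyllabic durations (seconds) for one speaker.
intervals = [0.18, 0.22, 0.20, 0.25, 0.15]
print(variation_coefficient(intervals))

# Hypothetical per-speaker values of one parameter for two groups.
native = [0.19, 0.20, 0.21, 0.20]
learners = [0.26, 0.28, 0.25, 0.27]
print(eta_squared([native, learners]))
```

An eta-squared close to 1 means that most of the variance in the parameter is explained by group membership, while a value close to 0 means the groups overlap almost entirely on that parameter.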
Anthology ID:
2020.lrec-1.304
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
2497–2502
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.304
Cite (ACL):
Sylvain Coulange and Solange Rossato. 2020. Rhythmic Proximity Between Natives And Learners Of French - Evaluation of a metric based on the CEFC corpus. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 2497–2502, Marseille, France. European Language Resources Association.
Cite (Informal):
Rhythmic Proximity Between Natives And Learners Of French - Evaluation of a metric based on the CEFC corpus (Coulange & Rossato, LREC 2020)
PDF:
https://preview.aclanthology.org/paclic-22-ingestion/2020.lrec-1.304.pdf