Translating Dislocations or Parentheticals : Investigating the Role of Prosodic Boundaries for Spoken Language Translation of French into English

Nicolas Ballier, Behnoosh Namdarzadeh, Maria Zimina, Jean-Baptiste Yunès


Abstract
This paper examines some of the effects of prosodic boundaries on ASR outputs and on spoken language translation into English for two competing French structures (“c’est” dislocations vs. “c’est” parentheticals). One native speaker of French read 104 test sentences, which were then submitted to two toolkits: SYSTRAN Pure Neural Server (SPNS9) (Crego et al., 2016) and Whisper. For SPNS9, we compare the translation of the text file used for the reading with the translation of the transcription generated through Vocapia ASR. We also test the speech recognition transcription engine by uploading an MP3 file, and follow the same procedure for OpenAI Whisper’s Web-scale Supervised Pretraining for Speech Recognition system (Radford et al., 2022). We report word error rate (WER) for the transcription tasks and BLEU scores for the different models. We show that punctuation varies across the ASR outputs and discuss this variability in relation to the duration of the utterance. We then discuss the effects of the prosodic boundaries and describe the status of the boundary in the speech-to-text systems, in particular the consequences for neural machine translation when the prosodic boundary is rendered as a comma, a full stop, or any other punctuation symbol. We use the reference transcript of the reading phase to compute the edit distance between the reference transcript and the ASR output. We also use textometric analyses with iTrameur (Fleury and Zimina, 2014) to gain insight into which errors can be attributed to ASR and which to neural machine translation.
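The abstract mentions WER and the edit distance between the reference transcript and the ASR output. The following is a minimal sketch of how such a score is commonly computed, assuming whitespace tokenization and a standard word-level Levenshtein distance; it is not the authors' evaluation script. The function names and the French example sentences are hypothetical and are not taken from the paper's test set.

```python
def word_edit_distance(reference, hypothesis):
    """Word-level Levenshtein distance between two token lists."""
    ref, hyp = list(reference), list(hypothesis)
    # previous[j] holds the distance between the processed prefix of ref
    # and the first j tokens of hyp.
    previous = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        current = [i]
        for j, hyp_tok in enumerate(hyp, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            current.append(min(
                previous[j] + 1,         # deletion of a reference word
                current[j - 1] + 1,      # insertion of a hypothesis word
                previous[j - 1] + cost,  # substitution or exact match
            ))
        previous = current
    return previous[-1]


def wer(reference, hypothesis):
    """Word error rate: edit distance divided by reference length."""
    ref_tokens = reference.split()
    hyp_tokens = hypothesis.split()
    if not ref_tokens:
        raise ValueError("reference must contain at least one word")
    return word_edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


# Hypothetical example: the ASR output renders the boundary as a full stop
# instead of a comma. With whitespace tokenization the punctuation stays
# attached to the word, so this counts as one substitution out of four words.
reference = "c'est difficile, la traduction"
hypothesis = "c'est difficile. la traduction"
print(wer(reference, hypothesis))
```

Under this assumption, a difference in boundary punctuation alone is enough to raise the score, so whether punctuation is kept or stripped before scoring changes what the WER measures.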
Anthology ID:
2023.mtsummit-users.11
Volume:
Proceedings of Machine Translation Summit XIX, Vol. 2: Users Track
Month:
September
Year:
2023
Address:
Macau SAR, China
Editors:
Masaru Yamada, Felix do Carmo
Venue:
MTSummit
Publisher:
Asia-Pacific Association for Machine Translation
Pages:
119–131
URL:
https://aclanthology.org/2023.mtsummit-users.11
Cite (ACL):
Nicolas Ballier, Behnoosh Namdarzadeh, Maria Zimina, and Jean-Baptiste Yunès. 2023. Translating Dislocations or Parentheticals : Investigating the Role of Prosodic Boundaries for Spoken Language Translation of French into English. In Proceedings of Machine Translation Summit XIX, Vol. 2: Users Track, pages 119–131, Macau SAR, China. Asia-Pacific Association for Machine Translation.
Cite (Informal):
Translating Dislocations or Parentheticals : Investigating the Role of Prosodic Boundaries for Spoken Language Translation of French into English (Ballier et al., MTSummit 2023)
PDF:
https://preview.aclanthology.org/ingest-2024-clasp/2023.mtsummit-users.11.pdf