Bismarck Bamfo Odoom


2024

Speech Data from Radio Broadcasts for Low Resource Languages
Bismarck Bamfo Odoom | Leibny Paola Garcia Perera | Prangthip Hansanti | Loic Barrault | Christophe Ropers | Matthew Wiesner | Kenton Murray | Alexandre Mourachko | Philipp Koehn
Proceedings of the 21st International Conference on Spoken Language Translation (IWSLT 2024)

We created a collection of speech data for 48 low resource languages. The corpus is extracted from radio broadcasts and processed with novel speech detection and language identification models based on a manually vetted subset of the audio for 10 languages. The data is made publicly available.

2023

JHU IWSLT 2023 Multilingual Speech Translation System Description
Henry Li Xinyuan | Neha Verma | Bismarck Bamfo Odoom | Ujvala Pradeep | Matthew Wiesner | Sanjeev Khudanpur
Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023)

We describe the Johns Hopkins ACL 60-60 Speech Translation systems submitted to the IWSLT 2023 Multilingual track, where we were tasked to translate ACL presentations from English into 10 languages. We developed cascaded speech translation systems for both the constrained and unconstrained subtracks. Our systems make use of pre-trained models as well as domain-specific corpora for this highly technical evaluation-only task. We find that the specific technical domain which ACL presentations fall into presents a unique challenge for both ASR and MT, and we present an error analysis and an ACL-specific corpus we produced to enable further work in this area.