Evaluating Multilingual Speech Translation under Realistic Conditions with Resegmentation and Terminology
Elizabeth Salesky, Kareem Darwish, Mohamed Al-Badrashiny, Mona Diab, Jan Niehues
Abstract
We present the ACL 60/60 evaluation sets for multilingual translation of ACL 2022 technical presentations into 10 target languages. This dataset enables further research into multilingual speech translation under realistic recording conditions with unsegmented audio and domain-specific terminology, applying NLP tools to text and speech in the technical domain, and evaluating and improving model robustness to diverse speaker demographics.
- Anthology ID:
- 2023.iwslt-1.2
- Volume:
- Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023)
- Month:
- July
- Year:
- 2023
- Address:
- Toronto, Canada (in-person and online)
- Editors:
- Elizabeth Salesky, Marcello Federico, Marine Carpuat
- Venue:
- IWSLT
- SIG:
- SIGSLT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 62–78
- URL:
- https://aclanthology.org/2023.iwslt-1.2
- DOI:
- 10.18653/v1/2023.iwslt-1.2
- Cite (ACL):
- Elizabeth Salesky, Kareem Darwish, Mohamed Al-Badrashiny, Mona Diab, and Jan Niehues. 2023. Evaluating Multilingual Speech Translation under Realistic Conditions with Resegmentation and Terminology. In Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023), pages 62–78, Toronto, Canada (in-person and online). Association for Computational Linguistics.
- Cite (Informal):
- Evaluating Multilingual Speech Translation under Realistic Conditions with Resegmentation and Terminology (Salesky et al., IWSLT 2023)
- PDF:
- https://preview.aclanthology.org/naacl24-info/2023.iwslt-1.2.pdf