Charles Locock, Lowcock or Lockhart? Offline Speech Translation: Test Suite for Named Entities
Maximilian Awiszus, Jan Niehues, Marco Turchi, Sebastian Stüker, Alex Waibel
Abstract
Generating rare words is a challenging task for natural language processing in general and in speech translation (ST) specifically. This paper introduces a test suite prepared for the Offline ST shared task at IWSLT. In the test suite, corresponding rare words (i.e. named entities) were annotated on TED-Talks for English and German and the English side was made available to the participants together with some distractors (irrelevant named entities). Our evaluation checks the capabilities of ST systems to leverage the information in the contextual list of named entities and improve translation quality. Systems are ranked based on the recall and precision of named entities (separately on person, location, and organization names) in the translated texts. Our evaluation shows that using contextual information improves translation quality as well as the recall and precision of NEs. The recall of organization names in all submissions is the lowest of all categories with a maximum of 87.5 % confirming the difficulties of ST systems in dealing with names.
- Anthology ID:
- 2024.iwslt-1.35
- Volume:
- Proceedings of the 21st International Conference on Spoken Language Translation (IWSLT 2024)
- Month:
- August
- Year:
- 2024
- Address:
- Bangkok, Thailand (in-person and online)
- Editors:
- Elizabeth Salesky, Marcello Federico, Marine Carpuat
- Venue:
- IWSLT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 291–297
- URL:
- https://aclanthology.org/2024.iwslt-1.35
- DOI:
- 10.18653/v1/2024.iwslt-1.35
- Cite (ACL):
- Maximilian Awiszus, Jan Niehues, Marco Turchi, Sebastian Stüker, and Alex Waibel. 2024. Charles Locock, Lowcock or Lockhart? Offline Speech Translation: Test Suite for Named Entities. In Proceedings of the 21st International Conference on Spoken Language Translation (IWSLT 2024), pages 291–297, Bangkok, Thailand (in-person and online). Association for Computational Linguistics.
- Cite (Informal):
- Charles Locock, Lowcock or Lockhart? Offline Speech Translation: Test Suite for Named Entities (Awiszus et al., IWSLT 2024)
- PDF:
- https://preview.aclanthology.org/autopr/2024.iwslt-1.35.pdf
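The abstract says systems are ranked by the recall and precision of named entities, computed separately for person, location, and organization names. The abstract does not spell out the matching procedure, so the following is only a minimal sketch under stated assumptions: entities are matched by exact, case-sensitive substring search; a reference entity found in the translation counts as a true positive; a distractor found in the translation counts as a false positive. The function name `ne_recall_precision`, the input layout, and the example sentences are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def ne_recall_precision(hypotheses, references, distractors):
    """Per-category recall and precision of named entities (illustrative only).

    hypotheses:  list of translated segments (str)
    references:  per segment, list of (category, surface form) expected in it
    distractors: per segment, list of (category, surface form) that were
                 offered as context but should NOT appear in the translation

    Assumption: exact substring matching, which is a simplification and not
    necessarily the matching used in the paper.
    """
    tp = defaultdict(int)  # reference entity present in the hypothesis
    fn = defaultdict(int)  # reference entity missing from the hypothesis
    fp = defaultdict(int)  # distractor entity present in the hypothesis

    for hyp, refs, dists in zip(hypotheses, references, distractors):
        for category, name in refs:
            if name in hyp:
                tp[category] += 1
            else:
                fn[category] += 1
        for category, name in dists:
            if name in hyp:
                fp[category] += 1

    scores = {}
    for category in set(tp) | set(fn) | set(fp):
        expected = tp[category] + fn[category]
        produced = tp[category] + fp[category]
        scores[category] = {
            "recall": tp[category] / expected if expected else 0.0,
            "precision": tp[category] / produced if produced else 0.0,
        }
    return scores


# Made-up example data built around the names in the paper title.
hyps = [
    "Charles Locock wurde erwähnt.",
    "Charles Lockhart wurde erwähnt.",
]
refs = [
    [("person", "Charles Locock")],
    [("person", "Charles Locock")],
]
dists = [
    [("person", "Charles Lowcock"), ("person", "Charles Lockhart")],
    [("person", "Charles Lowcock"), ("person", "Charles Lockhart")],
]

print(ne_recall_precision(hyps, refs, dists))
# {'person': {'recall': 0.5, 'precision': 0.5}}
```

In this toy case the second segment contains a distractor instead of the reference name, so it lowers both recall (one of two reference names is missing) and precision (one of two produced names is wrong).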