Abstract
This paper describes the submission of a high-quality translation of the OLDI Seed dataset into Italian for the WMT 2023 Open Language Data Initiative shared task. The base of this submission is a previous version of an Italian OLDI Seed dataset released by Haberland et al. (2024) via machine translation and partial post-editing. This data was subsequently reviewed in its entirety by two native speakers of Italian, who carried out extensive post-editing with particular attention to the idiomatic translation of named entities.
- Anthology ID:
- 2024.wmt-1.43
- Volume:
- Proceedings of the Ninth Conference on Machine Translation
- Month:
- November
- Year:
- 2024
- Address:
- Miami, Florida, USA
- Editors:
- Barry Haddow, Tom Kocmi, Philipp Koehn, Christof Monz
- Venue:
- WMT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 567–569
- URL:
- https://preview.aclanthology.org/add_missing_videos/2024.wmt-1.43/
- DOI:
- 10.18653/v1/2024.wmt-1.43
- Cite (ACL):
- Edoardo Ferrante. 2024. A High-quality Seed Dataset for Italian Machine Translation. In Proceedings of the Ninth Conference on Machine Translation, pages 567–569, Miami, Florida, USA. Association for Computational Linguistics.
- Cite (Informal):
- A High-quality Seed Dataset for Italian Machine Translation (Ferrante, WMT 2024)
- PDF:
- https://preview.aclanthology.org/add_missing_videos/2024.wmt-1.43.pdf