Beyond Human-Only: Evaluating Human-Machine Collaboration for Collecting High-Quality Translation Data
Zhongtao Liu, Parker Riley, Daniel Deutsch, Alison Lui, Mengmeng Niu, Apurva Shah, Markus Freitag
Abstract
Collecting high-quality translations is crucial for the development and evaluation of machine translation systems. However, traditional human-only approaches are costly and slow. This study presents a comprehensive investigation of 11 approaches for acquiring translation data, including human-only, machine-only, and hybrid approaches. Our findings demonstrate that human-machine collaboration can match or even exceed the quality of human-only translations, while being more cost-efficient. Error analysis reveals the complementary strengths between human and machine contributions, highlighting the effectiveness of collaborative methods. Cost analysis further demonstrates the economic benefits of human-machine collaboration methods, with some approaches achieving top-tier quality at around 60% of the cost of traditional methods. We release a publicly available dataset containing nearly 18,000 segments of varying translation quality with corresponding human ratings to facilitate future research.
- Anthology ID:
- 2024.wmt-1.110
- Volume:
- Proceedings of the Ninth Conference on Machine Translation
- Month:
- November
- Year:
- 2024
- Address:
- Miami, Florida, USA
- Editors:
- Barry Haddow, Tom Kocmi, Philipp Koehn, Christof Monz
- Venue:
- WMT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1095–1106
- URL:
- https://preview.aclanthology.org/ingest_wac_2008/2024.wmt-1.110/
- DOI:
- 10.18653/v1/2024.wmt-1.110
- Cite (ACL):
- Zhongtao Liu, Parker Riley, Daniel Deutsch, Alison Lui, Mengmeng Niu, Apurva Shah, and Markus Freitag. 2024. Beyond Human-Only: Evaluating Human-Machine Collaboration for Collecting High-Quality Translation Data. In Proceedings of the Ninth Conference on Machine Translation, pages 1095–1106, Miami, Florida, USA. Association for Computational Linguistics.
- Cite (Informal):
- Beyond Human-Only: Evaluating Human-Machine Collaboration for Collecting High-Quality Translation Data (Liu et al., WMT 2024)
- PDF:
- https://preview.aclanthology.org/ingest_wac_2008/2024.wmt-1.110.pdf