Abstract
Building on a recent method for decoding translation candidates from a Machine Translation (MT) model via a genetic algorithm, we modify it to generate adversarial translations to test and challenge MT evaluation metrics. The produced translations score very well in an arbitrary MT evaluation metric selected beforehand, despite containing serious, deliberately introduced errors. The method can be used to create adversarial test sets to analyze the biases and shortcomings of the metrics. We publish various such test sets for the Czech to English language pair, as well as the code to convert any parallel data into a similar adversarial test set.- Anthology ID:
- 2024.lrec-main.668
- Volume:
- Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
- Month:
- May
- Year:
- 2024
- Address:
- Torino, Italia
- Editors:
- Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
- Venues:
- LREC | COLING
- SIG:
- Publisher:
- ELRA and ICCL
- Note:
- Pages:
- 7562–7569
- Language:
- URL:
- https://aclanthology.org/2024.lrec-main.668
- DOI:
- Cite (ACL):
- Josef Jon and Ondřej Bojar. 2024. GAATME: A Genetic Algorithm for Adversarial Translation Metrics Evaluation. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 7562–7569, Torino, Italia. ELRA and ICCL.
- Cite (Informal):
- GAATME: A Genetic Algorithm for Adversarial Translation Metrics Evaluation (Jon & Bojar, LREC-COLING 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/2024.lrec-main.668.pdf