GAATME: A Genetic Algorithm for Adversarial Translation Metrics Evaluation

Josef Jon, Ondřej Bojar


Abstract
We adapt a recent method that uses a genetic algorithm to decode translation candidates from a Machine Translation (MT) model, modifying it to generate adversarial translations that test and challenge MT evaluation metrics. The produced translations score very well under an arbitrary MT evaluation metric selected beforehand, despite containing serious, deliberately introduced errors. The method can be used to create adversarial test sets for analyzing the biases and shortcomings of the metrics. We publish several such test sets for the Czech-to-English language pair, as well as the code to convert any parallel data into a similar adversarial test set.
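The idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the metric (`toy_metric`, a unigram F1 overlap), the error model (a single required `error_token`), the mutation and crossover operators, and all parameter values are assumptions chosen only to show how a genetic algorithm can search for candidates that keep a deliberate error while maximizing a chosen metric.

```python
import random


def toy_metric(candidate, reference):
    """Unigram F1 overlap; a hypothetical stand-in for a real MT metric."""
    cand, ref = set(candidate), set(reference)
    overlap = len(cand & ref)
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def fitness(candidate, reference, error_token):
    # Assumption: the "deliberate error" is modeled as a token that must
    # be present. Candidates without it are rejected with a score of -1.
    if error_token not in candidate:
        return -1.0
    return toy_metric(candidate, reference)


def mutate(candidate, vocab, rng):
    """Insert, replace, or delete one token at a random position."""
    child = list(candidate)
    op = rng.choice(["replace", "insert", "delete"])
    if op == "insert" or not child:
        child.insert(rng.randint(0, len(child)), rng.choice(vocab))
    elif op == "replace":
        child[rng.randrange(len(child))] = rng.choice(vocab)
    elif len(child) > 1:
        del child[rng.randrange(len(child))]
    return child


def crossover(a, b, rng):
    """Join a prefix of one parent with a suffix of the other."""
    return a[:rng.randint(0, len(a))] + b[rng.randint(0, len(b)):]


def evolve(seeds, reference, error_token, vocab,
           generations=30, pop_size=20, seed=0):
    rng = random.Random(seed)
    population = [list(c) for c in seeds]

    def score(candidate):
        return fitness(candidate, reference, error_token)

    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            a, b = rng.choice(population), rng.choice(population)
            children.append(mutate(crossover(a, b, rng), vocab, rng))
        # Elitism: parents compete with children, so the best score
        # in the population never decreases between generations.
        population = sorted(population + children,
                            key=score, reverse=True)[:pop_size]
    return population[0]


reference = "the cat sat on the mat".split()
seeds = [
    "the cat did not sit on the mat".split(),  # already contains the error
    "the dog sat on the mat".split(),
]
vocab = sorted(set(reference) | {"dog", "not"})
best = evolve(seeds, reference, error_token="not", vocab=vocab)
print(" ".join(best), fitness(best, reference, "not"))
```

In this sketch the negation "not" reverses the meaning of the sentence, yet the surface-overlap metric can still reward the candidate highly, which is the kind of blind spot an adversarial test set is meant to expose.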
Anthology ID:
2024.lrec-main.668
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
Publisher:
ELRA and ICCL
Pages:
7562–7569
URL:
https://aclanthology.org/2024.lrec-main.668
Cite (ACL):
Josef Jon and Ondřej Bojar. 2024. GAATME: A Genetic Algorithm for Adversarial Translation Metrics Evaluation. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 7562–7569, Torino, Italia. ELRA and ICCL.
Cite (Informal):
GAATME: A Genetic Algorithm for Adversarial Translation Metrics Evaluation (Jon & Bojar, LREC-COLING 2024)
PDF:
https://preview.aclanthology.org/nschneid-patch-2/2024.lrec-main.668.pdf