ParBLEU: Augmenting Metrics with Automatic Paraphrases for the WMT’20 Metrics Shared Task

Rachel Bawden, Biao Zhang, Andre Tättar, Matt Post


Abstract
We describe parBLEU, parCHRF++, and parESIM, which augment baseline metrics with automatically generated paraphrases produced by PRISM (Thompson and Post, 2020a), a multilingual neural machine translation system. We build on recent work studying how to improve BLEU with diverse, automatically paraphrased references (Bawden et al., 2020), extending the experiments to the multilingual setting of the WMT 2020 metrics shared task and to three base metrics, and comparing the metrics' capacity to exploit up to 100 additional synthetic references. We find that gains are possible when using additional automatically paraphrased references, although they are not systematic. However, segment-level correlations, particularly into English, improve for all three metrics, even with larger numbers of paraphrased references.
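As a rough illustration of the multi-reference scoring idea behind parBLEU, the sketch below scores a set of hypotheses against a human reference stream alone and then against the human stream plus one paraphrased stream, using sacrebleu's standard corpus_bleu API. The example sentences and paraphrases are hypothetical placeholders; in the paper the paraphrased references are generated automatically with PRISM, and up to 100 such streams are used.

```python
# Minimal sketch: BLEU with a single human reference vs. BLEU augmented with
# an additional (here hand-written, hypothetical) paraphrased reference stream.
import sacrebleu

hypotheses = ["The cat sat on the mat.", "He quickly went home."]

# One human reference per segment.
human_refs = ["The cat was sitting on the mat.", "He went home quickly."]

# Placeholder automatic paraphrases of the human references (one stream shown).
paraphrased_refs = ["A cat sat upon the mat.", "He hurried back home."]

# Baseline BLEU: single human reference stream.
baseline = sacrebleu.corpus_bleu(hypotheses, [human_refs])

# Augmented BLEU: human references plus the paraphrased reference stream.
augmented = sacrebleu.corpus_bleu(hypotheses, [human_refs, paraphrased_refs])

print(f"baseline BLEU:  {baseline.score:.2f}")
print(f"augmented BLEU: {augmented.score:.2f}")
```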
Anthology ID:
2020.wmt-1.98
Volume:
Proceedings of the Fifth Conference on Machine Translation
Month:
November
Year:
2020
Address:
Online
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
887–894
URL:
https://aclanthology.org/2020.wmt-1.98
Cite (ACL):
Rachel Bawden, Biao Zhang, Andre Tättar, and Matt Post. 2020. ParBLEU: Augmenting Metrics with Automatic Paraphrases for the WMT’20 Metrics Shared Task. In Proceedings of the Fifth Conference on Machine Translation, pages 887–894, Online. Association for Computational Linguistics.
Cite (Informal):
ParBLEU: Augmenting Metrics with Automatic Paraphrases for the WMT’20 Metrics Shared Task (Bawden et al., WMT 2020)
PDF:
https://preview.aclanthology.org/auto-file-uploads/2020.wmt-1.98.pdf
Video:
https://slideslive.com/38939673