eBLEU: Unexpectedly Good Machine Translation Evaluation Using Simple Word Embeddings

Muhammad ElNokrashy, Tom Kocmi


Abstract
We propose eBLEU, a metric inspired by the BLEU metric that uses embedding similarities instead of string matches. We introduce meaning diffusion vectors to enable matching n-grams of semantically similar words in a BLEU-like algorithm, using efficient, non-contextual word embeddings like fastText. On WMT23 data, eBLEU beats BLEU and ChrF by around 3.8% in system-level score, approaching BERTScore at a −0.9% absolute difference. In WMT22 scenarios, eBLEU outperforms f101spBLEU and ChrF on MQM by 2.2%–3.6%. Curiously, on MTurk evaluations, eBLEU surpasses past methods (f200spBLEU, COMET-22) by 3.9%–8.2%. eBLEU presents an interesting middle ground between traditional metrics and pretrained metrics.
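The abstract describes the mechanism only at a high level. As a rough illustration, the sketch below shows one way a BLEU-like n-gram precision can be relaxed with embedding cosine similarity; the toy vectors, the positionwise soft-matching rule, and the helper names (ngram_vectors, soft_precision, ebleu_like) are illustrative assumptions, not the paper's actual meaning-diffusion formulation.

```python
# Minimal sketch of embedding-based soft n-gram matching in a BLEU-like
# score. NOT the authors' exact eBLEU definition: the real metric uses
# "meaning diffusion vectors" over fastText embeddings (see the paper).
import numpy as np

def cosine(u, v):
    # Cosine similarity between two word vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9))

def ngram_vectors(tokens, emb, n):
    # Represent each n-gram as the list of its words' embedding vectors.
    return [[emb[t] for t in tokens[i:i + n]]
            for i in range(len(tokens) - n + 1)]

def soft_ngram_sim(g1, g2):
    # Soft match of two equal-length n-grams: mean positionwise cosine.
    return sum(cosine(u, v) for u, v in zip(g1, g2)) / len(g1)

def soft_precision(hyp, ref, emb, n):
    hyp_grams = ngram_vectors(hyp, emb, n)
    ref_grams = ngram_vectors(ref, emb, n)
    if not hyp_grams or not ref_grams:
        return 0.0
    # Credit each hypothesis n-gram with its best soft match in the
    # reference -- a soft analogue of BLEU's n-gram match counts.
    return sum(max(soft_ngram_sim(h, r) for r in ref_grams)
               for h in hyp_grams) / len(hyp_grams)

def ebleu_like(hyp, ref, emb, max_n=4):
    # Geometric mean of soft n-gram precisions, as in BLEU
    # (brevity penalty omitted to keep the sketch short).
    top = min(max_n, len(hyp), len(ref))
    precs = [max(soft_precision(hyp, ref, emb, n), 1e-9)
             for n in range(1, top + 1)]
    return float(np.exp(np.mean(np.log(precs))))

# Toy 2-d embeddings; in practice one would load fastText vectors
# (e.g. via the fasttext package's load_model / get_word_vector).
emb = {
    "the": np.array([1.0, 0.1]),
    "a": np.array([0.95, 0.15]),
    "cat": np.array([0.1, 1.0]),
    "feline": np.array([0.2, 0.95]),
    "sat": np.array([0.6, 0.6]),
}
print(ebleu_like("a feline sat".split(), "the cat sat".split(), emb))
```

Under this relaxation, "a feline sat" still earns substantial n-gram credit against "the cat sat" even though no surface string matches, which is the intuition behind replacing BLEU's exact matches with embedding similarity.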
Anthology ID:
2023.wmt-1.61
Volume:
Proceedings of the Eighth Conference on Machine Translation
Month:
December
Year:
2023
Address:
Singapore
Editors:
Philipp Koehn, Barry Haddow, Tom Kocmi, Christof Monz
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
746–750
URL:
https://aclanthology.org/2023.wmt-1.61
DOI:
10.18653/v1/2023.wmt-1.61
Cite (ACL):
Muhammad ElNokrashy and Tom Kocmi. 2023. eBLEU: Unexpectedly Good Machine Translation Evaluation Using Simple Word Embeddings. In Proceedings of the Eighth Conference on Machine Translation, pages 746–750, Singapore. Association for Computational Linguistics.
Cite (Informal):
eBLEU: Unexpectedly Good Machine Translation Evaluation Using Simple Word Embeddings (ElNokrashy & Kocmi, WMT 2023)
PDF:
https://aclanthology.org/2023.wmt-1.61.pdf