@inproceedings{elnokrashy-kocmi-2023-ebleu,
    title     = {e{BLEU}: Unexpectedly Good Machine Translation Evaluation Using Simple Word Embeddings},
    author    = {ElNokrashy, Muhammad and
                 Kocmi, Tom},
    editor    = {Koehn, Philipp and
                 Haddow, Barry and
                 Kocmi, Tom and
                 Monz, Christof},
    booktitle = {Proceedings of the Eighth Conference on Machine Translation},
    month     = dec,
    year      = {2023},
    address   = {Singapore},
    publisher = {Association for Computational Linguistics},
    url       = {https://aclanthology.org/2023.wmt-1.61/},
    doi       = {10.18653/v1/2023.wmt-1.61},
    pages     = {746--750},
    abstract  = {We propose eBLEU, a metric inspired by BLEU metric that uses embedding similarities instead of string matches. We introduce meaning diffusion vectors to enable matching n-grams of semantically similar words in a BLEU-like algorithm, using efficient, non-contextual word embeddings like fastText. On WMT23 data, eBLEU beats BLEU and ChrF by around 3.8{\%} system-level score, approaching BERTScore at {\ensuremath{-}}0.9{\%} absolute difference. In WMT22 scenarios, eBLEU outperforms f101spBLEU and ChrF in MQM by 2.2{\%}{\ensuremath{-}}3.6{\%}. Curiously, on MTurk evaluations, eBLEU surpasses past methods by 3.9{\%}{\ensuremath{-}}8.2{\%} (f200spBLEU, COMET-22). eBLEU presents an interesting middle-ground between traditional metrics and pretrained metrics.}
}
Markdown (Informal)
[eBLEU: Unexpectedly Good Machine Translation Evaluation Using Simple Word Embeddings](https://aclanthology.org/2023.wmt-1.61/) (ElNokrashy & Kocmi, WMT 2023)
ACL