Abstract
We propose eBLEU, a metric inspired by BLEU that uses embedding similarities instead of string matches. We introduce meaning diffusion vectors to enable matching n-grams of semantically similar words in a BLEU-like algorithm, using efficient, non-contextual word embeddings like fastText. On WMT23 data, eBLEU beats BLEU and ChrF by around 3.8% in system-level score, approaching BERTScore at −0.9% absolute difference. In WMT22 scenarios, eBLEU outperforms f101spBLEU and ChrF in MQM by 2.2%−3.6%. Curiously, on MTurk evaluations, eBLEU surpasses past methods by 3.9%−8.2% (f200spBLEU, COMET-22). eBLEU presents an interesting middle ground between traditional metrics and pretrained metrics.
- Anthology ID:
- 2023.wmt-1.61
- Volume:
- Proceedings of the Eighth Conference on Machine Translation
- Month:
- December
- Year:
- 2023
- Address:
- Singapore
- Editors:
- Philipp Koehn, Barry Haddow, Tom Kocmi, Christof Monz
- Venue:
- WMT
- SIG:
- SIGMT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 746–750
- URL:
- https://aclanthology.org/2023.wmt-1.61
- DOI:
- 10.18653/v1/2023.wmt-1.61
- Cite (ACL):
- Muhammad ElNokrashy and Tom Kocmi. 2023. eBLEU: Unexpectedly Good Machine Translation Evaluation Using Simple Word Embeddings. In Proceedings of the Eighth Conference on Machine Translation, pages 746–750, Singapore. Association for Computational Linguistics.
- Cite (Informal):
- eBLEU: Unexpectedly Good Machine Translation Evaluation Using Simple Word Embeddings (ElNokrashy & Kocmi, WMT 2023)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2023.wmt-1.61.pdf
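The abstract's core idea of matching n-grams by embedding similarity rather than exact strings can be illustrated with a minimal sketch. This is not the paper's actual algorithm (the "meaning diffusion vectors" and clipping details are not reproduced here); it only shows, under stated assumptions, how a BLEU-like n-gram precision could score each hypothesis n-gram by its best cosine similarity to any reference n-gram, with n-grams represented as mean word vectors. The tiny `VECS` table is a hypothetical stand-in for fastText embeddings.

```python
# Sketch: soft n-gram precision via embedding similarity (assumption,
# not the eBLEU algorithm). Toy vectors stand in for fastText.
import math

VECS = {
    "cat": (1.0, 0.1), "feline": (0.9, 0.2),
    "sat": (0.1, 1.0), "rested": (0.2, 0.9),
    "the": (0.5, 0.5), "a": (0.5, 0.4),
}
DIMS = 2

def ngram_vec(words, n, i):
    # Mean of the word vectors for the n-gram starting at position i.
    acc = [0.0] * DIMS
    for w in words[i:i + n]:
        v = VECS.get(w, (0.0,) * DIMS)  # unknown words get a zero vector
        acc = [a + b for a, b in zip(acc, v)]
    return [a / n for a in acc]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def soft_ngram_precision(hyp, ref, n=2):
    """Score each hypothesis n-gram by its best cosine match among
    reference n-grams; return the mean of these best scores.
    Simplification: no count clipping as in real BLEU."""
    hyp, ref = hyp.split(), ref.split()
    if len(hyp) < n or len(ref) < n:
        return 0.0
    scores = []
    for i in range(len(hyp) - n + 1):
        hv = ngram_vec(hyp, n, i)
        best = max(cosine(hv, ngram_vec(ref, n, j))
                   for j in range(len(ref) - n + 1))
        scores.append(best)
    return sum(scores) / len(scores)
```

With exact string matching, "the feline rested" shares no bigram with "the cat sat" and would score 0; the soft precision above instead rewards the near-synonymous n-grams, which is the behavior the abstract motivates.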