Tokengram_F, a Fast and Accurate Token-based chrF++ Derivative

Sören Dreano, Derek Molloy, Noel Murphy


Abstract
Tokengram_F is an F-score-based evaluation metric for Machine Translation that is heavily inspired by chrF++ and can act as a more accurate replacement. By replacing word n-grams with n-grams obtained from tokenization algorithms, tokengram_F better captures similarities between words.
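The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the choice of beta = 2, the maximum n-gram order, and the averaging of precision and recall over orders are all assumptions made for illustration, and the subword splits in the usage note are hand-written rather than produced by a real tokenizer. The sketch only shows how an n-gram F-score behaves when the units are subword tokens instead of whole words.

```python
from collections import Counter


def ngrams(seq, n):
    """Count all contiguous n-grams of length n in a sequence of units."""
    return Counter(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))


def ngram_f_score(hyp, ref, max_n=2, beta=2.0):
    """F-beta score over n-gram overlap between a hypothesis and a reference.

    hyp and ref are sequences of units (words or subword tokens).
    Precision and recall are averaged over orders 1..max_n before the
    F-score is computed; this averaging is an assumption of the sketch.
    """
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        if not hyp_counts or not ref_counts:
            continue  # one side is too short to have n-grams of this order
        # Counter intersection keeps the minimum count of each shared n-gram.
        overlap = sum((hyp_counts & ref_counts).values())
        precisions.append(overlap / sum(hyp_counts.values()))
        recalls.append(overlap / sum(ref_counts.values()))
    if not precisions:
        return 0.0
    precision = sum(precisions) / len(precisions)
    recall = sum(recalls) / len(recalls)
    if precision + recall == 0:
        return 0.0
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)


# Hypothetical example: the same sentence pair as words and as subword tokens.
word_score = ngram_f_score(["walking", "home"], ["walked", "home"])
token_score = ngram_f_score(["walk", "ing", "home"], ["walk", "ed", "home"])
```

With whole words, "walking" and "walked" share nothing, so only "home" matches. With the hand-written subword split, the shared stem "walk" also matches, so the token-level score is higher for this pair.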
Anthology ID:
2023.wmt-1.59
Volume:
Proceedings of the Eighth Conference on Machine Translation
Month:
December
Year:
2023
Address:
Singapore
Editors:
Philipp Koehn, Barry Haddow, Tom Kocmi, Christof Monz
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
730–737
URL:
https://aclanthology.org/2023.wmt-1.59
DOI:
10.18653/v1/2023.wmt-1.59
Cite (ACL):
Sören Dreano, Derek Molloy, and Noel Murphy. 2023. Tokengram_F, a Fast and Accurate Token-based chrF++ Derivative. In Proceedings of the Eighth Conference on Machine Translation, pages 730–737, Singapore. Association for Computational Linguistics.
Cite (Informal):
Tokengram_F, a Fast and Accurate Token-based chrF++ Derivative (Dreano et al., WMT 2023)
PDF:
https://preview.aclanthology.org/nschneid-patch-5/2023.wmt-1.59.pdf