A Synthesis of Human and Machine: Correlating “New” Automatic Evaluation Metrics with Human Assessments
Abstract
The session will provide an overview of some of the new Machine Translation (MT) metrics available on the market. It will analyze whether and how these metrics correlate at segment level with the results of Adequacy and Fluency Human Assessments, and how they compare with TER scores and Levenshtein Distance (two of our currently preferred metrics) as well as with each other. The information in this session should give a better understanding of the metrics' strengths and weaknesses and support informed decisions when forecasting MT production.
- Anthology ID:
- 2021.mtsummit-up.29
- Volume:
- Proceedings of Machine Translation Summit XVIII: Users and Providers Track
- Month:
- August
- Year:
- 2021
- Address:
- Virtual
- Venue:
- MTSummit
- Publisher:
- Association for Machine Translation in the Americas
- Pages:
- 440–465
- URL:
- https://aclanthology.org/2021.mtsummit-up.29
- Cite (ACL):
- Mara Nunziatini and Andrea Alfieri. 2021. A Synthesis of Human and Machine: Correlating “New” Automatic Evaluation Metrics with Human Assessments. In Proceedings of Machine Translation Summit XVIII: Users and Providers Track, pages 440–465, Virtual. Association for Machine Translation in the Americas.
- Cite (Informal):
- A Synthesis of Human and Machine: Correlating “New” Automatic Evaluation Metrics with Human Assessments (Nunziatini & Alfieri, MTSummit 2021)