The State of the Machine Translation 2022

Konstantin Savenkov, Michel Lopez


Abstract
In this talk, we cover the 2022 annual report on the State of the Machine Translation, prepared jointly by Intento and e2f. The report analyses the performance of 20+ commercial MT engines across 9 industries (General, Colloquial, Education, Entertainment, Financial, Healthcare, Hospitality, IT, and Legal) and 10+ key language pairs. For the first time, the report uses a unique dataset, prepared by e2f, that covers all of the language/domain combinations above. The presentation focuses on the process of data selection and preparation, the report methodology, the principal scores to rely on when studying MT outcomes (COMET, BERTScore, PRISM, TER, and hLEPOR), and the main report outcomes (the best-performing MT engines for every language/domain combination). It includes a thorough comparison of the scores. It also covers language support, prices, and other features of the MT engines.
Anthology ID:
2022.amta-upg.4
Volume:
Proceedings of the 15th Biennial Conference of the Association for Machine Translation in the Americas (Volume 2: Users and Providers Track and Government Track)
Month:
September
Year:
2022
Address:
Orlando, USA
Editors:
Janice Campbell, Stephen Larocca, Jay Marciano, Konstantin Savenkov, Alex Yanishevsky
Venue:
AMTA
Publisher:
Association for Machine Translation in the Americas
Pages:
32–49
URL:
https://aclanthology.org/2022.amta-upg.4
Cite (ACL):
Konstantin Savenkov and Michel Lopez. 2022. The State of the Machine Translation 2022. In Proceedings of the 15th Biennial Conference of the Association for Machine Translation in the Americas (Volume 2: Users and Providers Track and Government Track), pages 32–49, Orlando, USA. Association for Machine Translation in the Americas.
Cite (Informal):
The State of the Machine Translation 2022 (Savenkov & Lopez, AMTA 2022)
Presentation:
 2022.amta-upg.4.Presentation.pdf