A comparison of translation performance between DeepL and Supertext

Alex Flückiger; Chantal Amrhein; Tim Graf; Frédéric Odermatt; Martin Pömsl; Philippe Schläpfer; Florian Schottmann; Samuel Läubli

Abstract

As strong machine translation (MT) systems are increasingly based on large language models (LLMs), reliable quality benchmarking requires methods that capture their ability to leverage extended context. This study compares two commercial MT systems – DeepL and Supertext – by assessing their performance on unsegmented texts. We evaluate translation quality across four language directions with professional translators assessing segments with full document-level context. While segment-level assessments indicate no strong preference between the systems in most cases, document-level analysis reveals a preference for Supertext in three out of four language directions, suggesting superior consistency across longer texts. We advocate for more context-sensitive evaluation methodologies to ensure that MT quality assessments reflect real-world usability. We release all evaluation data and scripts for further analysis and reproduction at https://github.com/supertext/evaluation_deepl_supertext.

Anthology ID: 2025.mtsummit-2.6
Volume: Proceedings of Machine Translation Summit XX: Volume 2
Month: June
Year: 2025
Address: Geneva, Switzerland
Editors: Pierrette Bouillon, Johanna Gerlach, Sabrina Girletti, Lise Volkart, Raphael Rubino, Rico Sennrich, Samuel Läubli, Martin Volk, Miquel Esplà-Gomis, Vincent Vandeghinste, Helena Moniz, Sara Szoc
Venue: MTSummit
Publisher: European Association for Machine Translation
Pages: 52–57
URL: https://preview.aclanthology.org/mtsummit-25-ingestion/2025.mtsummit-2.6/
Cite (ACL): Alex Flückiger, Chantal Amrhein, Tim Graf, Frédéric Odermatt, Martin Pömsl, Philippe Schläpfer, Florian Schottmann, and Samuel Läubli. 2025. A comparison of translation performance between DeepL and Supertext. In Proceedings of Machine Translation Summit XX: Volume 2, pages 52–57, Geneva, Switzerland. European Association for Machine Translation.
Cite (Informal): A comparison of translation performance between DeepL and Supertext (Flückiger et al., MTSummit 2025)
PDF: https://preview.aclanthology.org/mtsummit-25-ingestion/2025.mtsummit-2.6.pdf
