Maarit Koponen


2022

DiHuTra: a Parallel Corpus to Analyse Differences between Human Translations
Ekaterina Lapshinova-Koltunski | Maja Popović | Maarit Koponen
Proceedings of the Thirteenth Language Resources and Evaluation Conference

This paper describes a new corpus of human translations which contains both professional and student translations. The data consists of English source texts from news and reviews and their translations into Russian and Croatian, as well as a subcorpus containing translations of the review texts into Finnish. All target languages are mid-resourced and less- or mid-investigated. The corpus will be valuable for studying variation in translation, as it allows a direct comparison between human translations of the same source texts. The corpus will also be a valuable resource for evaluating machine translation systems. We believe that this resource will facilitate understanding and improvement of the quality issues in both human and machine translation. In the paper, we describe how the data was collected, provide information on translator groups, and summarise the differences between the human translations at hand based on our preliminary results with shallow features.

Proceedings of the 23rd Annual Conference of the European Association for Machine Translation
Helena Moniz | Lieve Macken | Andrew Rufener | Loïc Barrault | Marta R. Costa-jussà | Christophe Declercq | Maarit Koponen | Ellie Kemp | Spyridon Pilos | Mikel L. Forcada | Carolina Scarton | Joachim Van den Bogaert | Joke Daems | Arda Tezcan | Bram Vanroy | Margot Fonteyne
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation

DiHuTra: a Parallel Corpus to Analyse Differences between Human Translations
Ekaterina Lapshinova-Koltunski | Maja Popović | Maarit Koponen
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation

This project aimed to design a corpus of parallel human translations (HTs) of the same source texts by professionals and students. The resulting corpus consists of English news and reviews source texts, their translations into Russian and Croatian, and translations of the reviews into Finnish. The corpus will be valuable for both studying variation in translation and evaluating machine translation (MT) systems.

LITHME: Language in the Human-Machine Era
Maarit Koponen | Kais Allkivi-Metsoja | Antonio Pareja-Lora | Dave Sayers | Márta Seresi
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation

The LITHME COST Action brings together researchers from various fields of study focusing on language and technology. We present the overall goals of LITHME and the network’s working groups focusing on diverse questions related to language and technology. As an example of the work of the LITHME network, we discuss the working group on language work and language professionals.

2020

MT for Subtitling: Investigating professional translators’ user experience and feedback
Maarit Koponen | Umut Sulubacak | Kaisa Vitikainen | Jörg Tiedemann
Proceedings of 1st Workshop on Post-Editing in Modern-Day Translation

MT for subtitling: User evaluation of post-editing productivity
Maarit Koponen | Umut Sulubacak | Kaisa Vitikainen | Jörg Tiedemann
Proceedings of the 22nd Annual Conference of the European Association for Machine Translation

This paper presents a user evaluation of machine translation and post-editing for TV subtitles. Based on a process study where 12 professional subtitlers translated and post-edited subtitles, we compare effort in terms of task time and number of keystrokes. We also discuss examples of specific subtitling features like condensation, and how these features may have affected the post-editing results. In addition to overall MT quality, segmentation and timing of the subtitles are found to be important issues to be addressed in future work.

2018

The WMT’18 Morpheval test suites for English-Czech, English-German, English-Finnish and Turkish-English
Franck Burlot | Yves Scherrer | Vinit Ravishankar | Ondřej Bojar | Stig-Arne Grönroos | Maarit Koponen | Tommi Nieminen | François Yvon
Proceedings of the Third Conference on Machine Translation: Shared Task Papers

Progress in the quality of machine translation output calls for new automatic evaluation procedures and metrics. In this paper, we extend the Morpheval protocol introduced by Burlot and Yvon (2017) for the English-to-Czech and English-to-Latvian translation directions to three additional language pairs, and report its use to analyze the results of WMT 2018’s participants for these language pairs. Considering additional, typologically varied source and target languages also enables us to draw some generalizations regarding this morphology-oriented evaluation procedure.

2015

How to teach machine translation post-editing? Experiences from a post-editing course
Maarit Koponen
Proceedings of the 4th Workshop on Post-editing Technology and Practice

2013

This translation is not too bad: an analysis of post-editor choices in a machine-translation post-editing task
Maarit Koponen
Proceedings of the 2nd Workshop on Post-editing Technology and Practice

2012

Comparing human perceptions of post-editing effort with post-editing operations
Maarit Koponen
Proceedings of the Seventh Workshop on Statistical Machine Translation

Post-editing time as a measure of cognitive effort
Maarit Koponen | Wilker Aziz | Luciana Ramos | Lucia Specia
Workshop on Post-Editing Technology and Practice

Post-editing machine translations has been attracting increasing attention both as a common practice within the translation industry and as a way to evaluate Machine Translation (MT) quality via edit distance metrics between the MT and its post-edited version. Commonly used metrics such as HTER are limited in that they cannot fully capture the effort required for post-editing. In particular, the cognitive effort required may vary for different types of errors and may also depend on the context. We suggest post-editing time as a way to assess some of the cognitive effort involved in post-editing. This paper presents two experiments investigating the connection between post-editing time and cognitive effort. First, we examine whether sentences with long and short post-editing times involve edits of different levels of difficulty. Second, we study the variability in post-editing time and other statistics among editors.