Dragoș Ciobanu

Also published as: Dragos Ciobanu, Dragoş Ciobanu

2024

Bayesian Hierarchical Modelling for Analysing the Effect of Speech Synthesis on Post-Editing Machine Translation
Miguel Rios | Justus Brockmann | Claudia Wiesinger | Raluca Chereji | Alina Secară | Dragoș Ciobanu
Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 1)

Automatic speech synthesis has seen rapid development and integration in domains as diverse as accessibility services, translation, or language learning platforms. We analyse its integration in a post-editing machine translation (PEMT) environment and the effect this has on quality, productivity, and cognitive effort. We use Bayesian hierarchical modelling to analyse eye-tracking, time-tracking, and error annotation data resulting from an experiment involving 21 professional translators post-editing from English into German in a customised cloud-based CAT environment and listening to the source and/or target texts via speech synthesis. Using speech synthesis in a PEMT task has a non-substantial positive effect on quality, a substantial negative effect on productivity, and a substantial negative effect on the cognitive effort expended on the target text, signifying that participants need to allocate less cognitive effort to the target text.

LT-LiDER is an Erasmus+ cooperation project with two main aims. The first is to map the landscape of technological capabilities required to work as a language and/or translation expert in the digitalised and datafied language industry. The second is to generate training outputs that will help language and translation trainers improve their skills and adopt appropriate pedagogical approaches and strategies for integrating data-driven technology into their language or translation classrooms, with a focus on digital and AI literacy.

2023

Quality Analysis of Multilingual Neural Machine Translation Systems and Reference Test Translations for the English-Romanian language pair in the Medical Domain
Miguel Angel Rios Gaona | Raluca-Maria Chereji | Alina Secara | Dragos Ciobanu
Proceedings of the 24th Annual Conference of the European Association for Machine Translation

Multilingual Neural Machine Translation (MNMT) models allow translation across multiple languages with a single system. We study the quality of a domain-adapted MNMT model in the medical domain for English-Romanian with automatic metrics and a human error typology annotation based on the Multidimensional Quality Metrics (MQM). We further expand the MQM typology to include terminology-specific error categories. We compare the out-of-domain MNMT with the in-domain adapted MNMT on a standard test dataset of abstracts from medical publications. The in-domain MNMT model outperforms the out-of-domain MNMT in all measured automatic metrics and produces fewer errors. In addition, we manually annotate the reference test dataset to study the quality of the reference translations. We identify a high number of omissions, additions, and mistranslations in the reference dataset, and comment on the assumed accuracy of existing datasets. Finally, we compare how well the COMET, BERTScore, and chrF automatic metrics correlate with the MQM-annotated translations. COMET correlates better with the MQM scores than the other metrics do.

2022

Error Annotation in Post-Editing Machine Translation: Investigating the Impact of Text-to-Speech Technology
Justus Brockmann | Claudia Wiesinger | Dragoș Ciobanu
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation

As post-editing of machine translation (PEMT) is becoming one of the most dominant services offered by the language services industry (LSI), efforts are being made to support provision of this service with additional technology. We present text-to-speech (T2S) as a potential attention-raising technology for post-editors. Our study was conducted with university students and included both PEMT and error annotation of a creative text with and without T2S. Focusing on the error annotation data, our analysis finds that participants under-annotated fewer MT errors in the T2S condition compared to the silent condition. At the same time, more over-annotation was recorded. Finally, annotation performance corresponds to participants’ attitudes towards using T2S.

2006

Using Richly Annotated Trilingual Language Resources for Acquiring Reading Skills in a Foreign Language
Dragoş Ciobanu | Tony Hartley | Serge Sharoff
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

In an age when demand for innovative and motivating language teaching methodologies is very high, TREAT (the Trilingual REAding Tutor) combines the most advanced natural language processing (NLP) techniques with the latest second and third language acquisition (SLA/TLA) research in an intuitive and user-friendly environment. This environment has been proven to help adult learners (native speakers of L1) acquire reading skills in an unknown L3 which is related to (cognate with) an L2 they know to some extent. The corpus-based methodology relies on existing linguistic resources, as well as materials that are easy to assemble, and can be adapted to support other pairs of related L2-L3 languages. A small evaluation study conducted at the Leeds University Centre for Translation Studies indicates that, when using TREAT, learners feel more motivated to study an unknown L3, rapidly acquire significant linguistic knowledge of both the L3 and L2, and increase their performance when translating from L3 into L1.

Venues

eamt (4)
lrec (1)