Proceedings of Machine Translation Summit XIX, Vol. 2: Users Track

Masaru Yamada, Felix do Carmo (Editors)


Anthology ID:
2023.mtsummit-users
Month:
September
Year:
2023
Address:
Macau SAR, China
Venue:
MTSummit
Publisher:
Asia-Pacific Association for Machine Translation
URL:
https://aclanthology.org/2023.mtsummit-users
PDF:
https://preview.aclanthology.org/nschneid-patch-3/2023.mtsummit-users.pdf

pdf bib
Proceedings of Machine Translation Summit XIX, Vol. 2: Users Track
Masaru Yamada | Felix do Carmo

pdf bib
Exploring undergraduate translation students’ perceptions towards machine translation: A qualitative questionnaire survey
Jia Zhang

Machine translation (MT) has relatively recently been introduced in higher education institutions, with specialised courses provided for students. However, such courses are often offered at the postgraduate level or towards the last year of an undergraduate programme (e.g., Arenas & Moorkens, 2019; Doherty et al., 2012). Most previous studies have focused on postgraduate students or undergraduate students in the last year of their programme and surveyed their perceptions or attitudes towards MT with quantitative questionnaires (e.g., Liu et al., 2022; Yang et al., 2021); undergraduate students earlier in their translation education remain overlooked. As such, little is known about how they perceive and use MT and what their training needs may be. This study investigates the perceptions of MT held by undergraduate students at an early stage of translator training via qualitative questionnaires. Year-two translation students with little or no MT knowledge and no real-life translation experience (n=20) were asked to fill out a questionnaire with open-ended questions. Their answers were manually analysed by the researcher using NVivo to identify themes and arguments. It was revealed that even without formal training, the participants recognised MT’s potential advantages and disadvantages to a certain degree. They more often engaged MT as an instrument for learning language and translation than as a translation tool per se. None of the students reported post-editing machine-generated translation in their translation assignments. Instead, they consulted MT output to understand terms, slang, fixed combinations and complicated sentences and to produce accurate, authentic and diversified phrases and sentences. They held a positive attitude towards MT quality, agreed that MT increased their translation quality, and felt more confident with the tasks.
While they were willing to experiment with MT as a translation tool and perform post-editing in future tasks, they doubted that MT could be introduced in the classroom at their current stage of translation learning, fearing that it would impair their independent and critical thinking. Students did not mention any potential negative impact of MT on the development of their language proficiency or translation competence. It is hoped that the findings will make an evidence-based contribution to the design of MT curricula and teaching pedagogies.
Keywords: machine translation, post-editing, translator training, perception, attitudes, teaching pedagogy
References: Arenas, A. G., & Moorkens, J. (2019). Machine translation and post-editing training as part of a master’s programme. Journal of Specialised Translation, 31, 217–238. Doherty, S., Kenny, D., & Way, A. (2012). Taking statistical machine translation to the student translator. Proceedings of the 10th Conference of the Association for Machine Translation in the Americas: Commercial MT User Program. Liu, K., Kwok, H. L., Liu, J., & Cheung, A. K. (2022). Sustainability and influence of machine translation: Perceptions and attitudes of translation instructors and learners in Hong Kong. Sustainability, 14(11), 6399. Yang, Y., Wang, X., & Yuan, Q. (2021). Measuring the usability of machine translation in the classroom context. Translation and Interpreting Studies, 16(1), 101–123.

pdf bib
MT and legal translation: applications in training
Suzana Cunha

This paper investigates the introduction of machine translation (MT) in the legal translation class by means of a pilot study conducted with two groups of students. Both groups took courses in legal translation, but only one was familiarised with post-editing (PE). The groups post-edited an extract of a Portuguese company formation document, translated by an open-access neural machine translation (NMT) system, and subsequently reflected on the assigned task. Although the scope of the study was limited, it was sufficient to confirm that prior exposure to machine translation post-editing (MTPE) did not significantly alter the two groups’ editing operations. The pilot study is part of a broader investigation into how technology affects the decision-making process of trainee legal translators, and its results contributed to fine-tuning a methodological tool that aims to integrate MTPE procedures into an existing process-oriented legal translation approach developed by Prieto Ramos (2014). The study was repeated this year; this time, both groups of trainees were introduced to and used the tool in class. A comparison of the two studies’ results is expected to provide insight into the productive use of MTPE in other domain-specific texts.

pdf
Technology Preparedness and Translator Training: Implications for Pedagogy
Hari Venkatesan

With increasing acknowledgement of the enhanced quality now achievable by Machine Translation, new possibilities have emerged in translation, both vis-à-vis the division of labour between human and machine in the translation process and the acceptability of lower-quality language in exchange for efficiency. This paper presents surveys of four cohorts of postgraduate students of translation at the University of Macau to see whether perceived trainee awareness and preparedness have kept pace with these possibilities. It is found that trainees across the years generally lack confidence in their perceived awareness, are hesitant to employ MT, and show definite reservations when reconsidering issues such as quality and division of labour. While the number of respondents is small, it is interesting to note that the awareness and preparedness mentioned above are similar across the four years. The implication for training is that technology should be fully integrated into the translation process in order to provide trainees with a template/framework for handling diverse situations, particularly those that require offering translations of lower quality with a short turnaround time. The focus here is on Chinese-English translation, but the discussion may find resonance with other language pairs.
Keywords: Translator training, Computer-Assisted Translation, Machine Translation, translation pedagogy, Chinese-English translation

pdf
Reception of machine-translated and human-translated subtitles – A case study
Frederike Schierl

Accessibility and inclusion have become key terms of the last decades, and linguistics is no exception. Machine-translated subtitling has become a new approach to overcoming linguistic accessibility barriers, since it has proven fast and thus cost-efficient for audiovisual media, as opposed to human translation, which is time-intensive and costly. Machine translation can be considered a solution when a translation is urgently needed. Overall, studies researching the benefits of subtitling yield different results, always depending on the application context (see Chan et al. 2022, Hu et al. 2020). Still, the acceptance of machine-translated subtitles is limited (see Tuominen et al., 2023), and users are rather skeptical, especially regarding the quality of MT subtitles. In the presented project, I investigated the effects of machine-translated subtitling (raw machine translation), compared to human-translated subtitling, on the consumer, presenting the results of a case study, aware both that HT as the gold standard for translation is increasingly called into question and that today’s NMT output is convincing. The study investigates the use of (machine-translated) subtitles by the average consumer, motivated by the current strong societal interest. I base my research project on the 3 R concept, i.e. response, reaction, and repercussion (Gambier, 2009): participants were asked to watch two video presentations on educational topics, one in German and another in Finnish, subtitled either with machine translation or by a human translator, or in a mixed condition (machine-translated and human-translated). Subtitle languages are English, German, and Finnish. Afterwards, they were asked to respond to questions on the video content (information retrieval) and to evaluate the subtitles based on the User Experience Questionnaire (Laugwitz et al., 2008) and the NASA Task Load Index (NASA, 2006).
The case study shows that information retrieval in the HT conditions is higher, except for the direction Finnish-German. However, users generally report a better user experience for all languages, which indicates a higher immersion. Participants also report that long subtitles combined with a fast pace contribute to more stress and more distraction from the other visual elements. Generally, users recognise the potential of MT subtitles, but also state that a human in the loop is still needed to ensure publishable quality.
References: Chan, Win Shan, Jan-Louis Kruger, and Stephen Doherty. 2022. ‘An Investigation of Subtitles as Learning Support in University Education’. Journal of Specialised Translation, no. 38: 155–79. Gambier, Yves. 2009. ‘Challenges in Research on Audiovisual Translation.’ In Translation Research Projects 2, edited by Pym, Anthony and Alexander Perekrestenko, 17–25. Tarragona: Intercultural Studies Group. Hu, Ke, Sharon O’Brien, and Dorothy Kenny. 2020. ‘A Reception Study of Machine Translated Subtitles for MOOCs’. Perspectives 28 (4): 521–38. https://doi.org/10.1080/0907676X.2019.1595069. Laugwitz, Bettina, Theo Held, and Martin Schrepp. 2008. ‘Construction and Evaluation of a User Experience Questionnaire’. In Symposium of the Austrian HCI and Usability Engineering Group, edited by Andreas Holzinger, 63–76. Springer. NASA. 2006. ‘NASA TLX: Task Load Index’. Tuominen, Tiina, Maarit Koponen, Kaisa Vitikainen, Umut Sulubacak, and Jörg Tiedemann. 2023. ‘Exploring the Gaps in Linguistic Accessibility of Media: The Potential of Automated Subtitling as a Solution’. Journal of Specialised Translation, no. 39: 77–89.

pdf
Machine Translation Implementation in Automatic Subtitling from a Subtitlers’ Perspective
Bina Xie

In recent years, automatic subtitling has gained considerable scholarly attention. Machine translation, a core component of automatic subtitling, still faces challenges when implemented in subtitling editors, and there remains a significant research gap concerning its implementation in automatic subtitling. This project compared videos with different levels of non-verbal input, translated from English into Simplified Chinese, to examine post-editing effort in automatic subtitling. The research collected the following data: process logs, which record the total time spent on the subtitles; keystrokes; and responses to a user experience questionnaire (UEQ). Twelve subtitlers from a translation agency in Mainland China were invited to complete the task. The results show no significant difference between videos with low and high levels of non-verbal input in terms of time spent. Furthermore, the subtitlers spent more effort revising spotting and segmentation than translation when post-editing texts with a high level of non-verbal input. While a majority of subtitlers show a positive attitude towards the application of machine translation, their apprehension lies in potential overreliance on it.

pdf
Improving Standard German Captioning of Spoken Swiss German: Evaluating Multilingual Pre-trained Models
Jonathan David Mutal | Pierrette Bouillon | Johanna Gerlach | Marianne Starlander

Multilingual pre-trained language models are often the best alternative in low-resource settings. In the context of a cascade architecture for automatic Standard German captioning of spoken Swiss German, we evaluate different models on the task of transforming normalised Swiss German ASR output into Standard German. Instead of training a large model from scratch, we fine-tuned publicly available pre-trained models, which reduces the cost of training high-quality neural machine translation models. Results show that pre-trained multilingual models achieve the highest scores, and that a higher number of languages included in pre-training improves the performance. We also observed that the type of source and target included in fine-tuning data impacts the results.

pdf
Leveraging Multilingual Knowledge Graph to Boost Domain-specific Entity Translation of ChatGPT
Min Zhang | Limin Liu | Zhao Yanqing | Xiaosong Qiao | Su Chang | Xiaofeng Zhao | Junhao Zhu | Ming Zhu | Song Peng | Yinglu Li | Yilun Liu | Wenbing Ma | Mengyao Piao | Shimin Tao | Hao Yang | Yanfei Jiang

Recently, ChatGPT has shown promising results for Machine Translation (MT) in general domains and is becoming a new paradigm for translation. In this paper, we focus on how to apply ChatGPT to domain-specific translation and propose to leverage Multilingual Knowledge Graph (MKG) to help ChatGPT improve the domain entity translation quality. To achieve this, we extract the bilingual entity pairs from MKG for the domain entities that are recognized from source sentences. We then introduce these pairs into translation prompts, instructing ChatGPT to use the correct translations of the domain entities. To evaluate the novel MKG method for ChatGPT, we conduct comparative experiments on three Chinese-English (zh-en) test datasets constructed from three specific domains, of which one domain is from biomedical science, and the other two are from the Information and Communications Technology (ICT) industry — Visible Light Communication (VLC) and wireless domains. Experimental results demonstrate that both the overall translation quality of ChatGPT (+6.21, +3.13 and +11.25 in BLEU scores) and the translation accuracy of domain entities (+43.2%, +30.2% and +37.9% absolute points) are significantly improved with MKG on the three test datasets.
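The MKG-augmented prompting that the abstract describes can be sketched as follows. This is an illustrative sketch only: the prompt wording, the `build_prompt` helper, and the example entity pair are assumptions, not the authors' implementation.

```python
# Sketch of knowledge-graph-assisted prompting for domain entity
# translation: bilingual entity pairs retrieved from a multilingual
# knowledge graph (MKG) are injected into the translation prompt.

def build_prompt(source_sentence, entity_pairs):
    """Compose a translation prompt that pins down the translations
    of recognized domain entities.

    entity_pairs: list of (source_entity, target_entity) tuples."""
    constraints = "; ".join(
        f'"{src}" must be translated as "{tgt}"' for src, tgt in entity_pairs
    )
    return (
        "Translate the following Chinese sentence into English. "
        f"Use these term translations: {constraints}.\n"
        f"Sentence: {source_sentence}"
    )

# Hypothetical example: a VLC-domain sentence with one MKG hit.
pairs = [("可见光通信", "visible light communication")]
prompt = build_prompt("可见光通信系统的误码率较低。", pairs)
print(prompt)
```

The resulting string would then be sent to the ChatGPT API; entity recognition and KG lookup are assumed to have happened upstream.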

pdf
Human-in-the-loop Machine Translation with Large Language Model
Xinyi Yang | Runzhe Zhan | Derek F. Wong | Junchao Wu | Lidia S. Chao

The large language model (LLM) has garnered significant attention due to its in-context learning mechanisms and emergent capabilities. The research community has conducted several pilot studies to apply LLMs to machine translation tasks and evaluate their performance from diverse perspectives. However, previous research has primarily focused on the LLM itself and has not explored human intervention in the inference process of LLM. The characteristics of LLM, such as in-context learning and prompt engineering, closely mirror human cognitive abilities in language tasks, offering an intuitive solution for human-in-the-loop generation. In this study, we propose a human-in-the-loop pipeline that guides LLMs to produce customized outputs with revision instructions. The pipeline initiates by prompting the LLM to produce a draft translation, followed by the utilization of automatic retrieval or human feedback as supervision signals to enhance the LLM’s translation through in-context learning. The human-machine interactions generated in this pipeline are also stored in an external database to expand the in-context retrieval database, enabling us to leverage human supervision in an offline setting. We evaluate the proposed pipeline using the GPT-3.5-turbo API on five domain-specific benchmarks for German-English translation. The results demonstrate the effectiveness of the pipeline in tailoring in-domain translations and improving translation performance compared to direct translation instructions. Additionally, we discuss the experimental results from the following perspectives: 1) the effectiveness of different in-context retrieval methods; 2) the construction of a retrieval database under low-resource scenarios; 3) the observed differences across selected domains; 4) the quantitative analysis of sentence-level and word-level statistics; and 5) the qualitative analysis of representative translation cases.
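The draft-then-revise loop described above can be sketched minimally as below, with the LLM call stubbed out. The real pipeline uses the GPT-3.5-turbo API and automatic in-context retrieval, neither of which is shown; all names here are hypothetical.

```python
interaction_db = []  # human-machine interactions kept for later in-context retrieval

def llm(prompt):
    # Stand-in for a GPT-3.5-turbo call; returns canned text.
    return "revised translation" if prompt.startswith("Revise") else "draft translation"

def translate_with_feedback(source, feedback=None):
    """Step 1: prompt for a draft translation. Step 2 (optional):
    revise the draft using a supervision signal (human feedback
    or retrieved examples), via in-context learning."""
    draft = llm(f"Translate into English: {source}")
    if feedback is None:
        return draft
    revision = llm(
        "Revise the draft below following the instruction.\n"
        f"Source: {source}\nDraft: {draft}\nInstruction: {feedback}"
    )
    interaction_db.append((source, feedback, revision))  # grow the retrieval database
    return revision

print(translate_with_feedback("Das ist ein Test.", "Use formal register."))
```

Storing each `(source, feedback, revision)` triple mirrors the paper's external database, which lets later queries retrieve past human supervision offline.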

pdf
The impact of machine translation on the translation quality of undergraduate translation students
Jia Zhang | Hong Qian

Post-editing (PE) refers to checking, proofreading, and revising the translation output of any automated translation (Gouadec, 2007, p. 25). It is needed because the meaning of a text cannot yet be accurately and fluently conveyed by machine translation (MT). The importance of PE and, accordingly, of PE training has been widely acknowledged, and specialised courses have recently been introduced across universities and other organisations worldwide. However, scant consideration is given to when PE skills should be introduced in translation training. PE courses are usually offered to advanced translation learners, i.e., those at the postgraduate level or in the last year of an undergraduate programme. Also, existing empirical studies most often investigate the impact of MT on postgraduate students or undergraduate students in the last year of their study. This paper reports on a study that aims to determine the possible effects of MT and PE on the translation quality of students at the early stage of translator training, i.e., undergraduate translation students with only basic translation knowledge. Methodologically, an experiment was conducted to compare students’ (n=10) PEMT-based translations and from-scratch translations produced without the assistance of machine translation. Second-year students of an undergraduate translation programme were invited to translate two English texts of similar difficulty into Chinese. One of the texts was translated directly, while the other was translated with reference to machine-generated output. Translation quality can be dynamic: when examined from different perspectives using different methods, the quality of a translation can vary. Several methods of translation quality assessment were adopted in this project, including rubrics-based scoring, error analysis and fixed-point translation analysis. It was found that the quality of students’ post-edited translations was compromised compared with that of from-scratch translations.
In addition, errors were more homogenised in the PEMT-based translations. It is hoped that this study can shed some light on the role of PEMT in translator training and contribute to the curricula and course design of post-editing for translator education.
Reference: Gouadec, D. (2007). Translation as a Profession. John Benjamins Publishing.
Keywords: machine translation, post-editing, translator training, translation quality assessment, error analysis, undergraduate students

pdf
Leveraging Latent Topic Information to Improve Product Machine Translation
Bryan Zhang | Stephan Walter | Amita Misra | Liling Tan

Meeting the expectations of e-commerce customers involves offering a seamless online shopping experience in their preferred language. To achieve this, modern e-commerce platforms rely on machine translation systems to provide multilingual product information on a large scale. However, maintaining high-quality machine translation that can keep up with the ever-expanding volume of product data remains an open challenge for industrial machine translation systems. In this context, topical clustering emerges as a valuable approach, leveraging latent signals and interpretable textual patterns to potentially enhance translation quality and facilitate industry-scale translation data discovery. This paper proposes two innovative methods: topic-based data selection and topic-signal augmentation, both utilizing latent topic clusters to improve the quality of machine translation in e-commerce. Furthermore, we present a data discovery workflow that utilizes topic clusters to effectively manage the growing multilingual product catalogs, addressing the challenges posed by their expansion.

pdf
Translating Dislocations or Parentheticals: Investigating the Role of Prosodic Boundaries for Spoken Language Translation of French into English
Nicolas Ballier | Behnoosh Namdarzadeh | Maria Zimina | Jean-Baptiste Yunès

This paper examines some of the effects of prosodic boundaries on ASR outputs and spoken language translations into English for two competing French structures (“c’est” dislocations vs. “c’est” parentheticals). One native speaker of French read 104 test sentences, which were then submitted to two systems. We compared the outputs of two toolkits, SYSTRAN Pure Neural Server (SPNS9) (Crego et al., 2016) and Whisper. For SPNS9, we compared the translation of the text file used for the reading with the translation of the transcription generated through Vocapia ASR. We also tested the transcription engine for speech recognition by uploading an MP3 file, and used the same procedure for Whisper, OpenAI’s web-scale supervised pretraining for speech recognition system (Radford et al., 2022). We reported WER for the transcription tasks and BLEU scores for the different models. We evidenced the variability of the punctuation in the ASR outputs and discussed it in relation to the duration of the utterance. We discussed the effects of the prosodic boundaries and described the status of the boundary in the speech-to-text systems, discussing the consequences for neural machine translation of rendering the prosodic boundary as a comma, a full stop, or any other punctuation symbol. We used the reference transcript of the reading phase to compute the edit distance between the reference transcript and the ASR output. We also used textometric analyses with iTrameur (Fleury and Zimina, 2014) for insights into the errors that can be attributed to ASR or to neural machine translation.
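The transcription scoring mentioned above rests on word-level edit distance. A standard dynamic-programming sketch (not the authors' tooling) is:

```python
# Word-level Levenshtein distance and WER, as used to compare a
# reference transcript against ASR output.

def edit_distance(ref_words, hyp_words):
    m, n = len(ref_words), len(hyp_words)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if ref_words[i - 1] == hyp_words[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[m][n]

def wer(reference, hypothesis):
    ref = reference.split()
    return edit_distance(ref, hypothesis.split()) / len(ref)

print(wer("c'est une question importante", "c'est une question important"))  # → 0.25
```

Real evaluations would add tokenization and punctuation normalization, which matters here precisely because ASR punctuation varies.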

pdf
Exploring Multilingual Pretrained Machine Translation Models for Interactive Translation
Angel Navarro | Francisco Casacuberta

Pre-trained large language models (LLMs) have become very important tools in many artificial intelligence applications. In this work, we explore the use of these models in interactive machine translation environments. In particular, we chose mBART (multilingual Bidirectional and Auto-Regressive Transformer) as one of these LLMs. The system enables users to refine the translation output interactively by providing feedback. It uses a two-step process in which the NMT (Neural Machine Translation) model generates a preliminary translation in the first step and the user performs one correction in the second step, repeating the process until the sentence is correctly translated. We assessed the performance of both mBART and its fine-tuned version by comparing them to a state-of-the-art machine translation model on a benchmark dataset with respect to user effort, WSR (Word Stroke Ratio), and MAR (Mouse Action Ratio). The experimental results indicate that all the models performed comparably, suggesting that mBART is a viable option for an interactive machine translation environment, as it eliminates the need to train a model from scratch for this particular task. The implications of this finding extend to the development of new machine translation models for interactive environments, as it indicates that novel pre-trained models exhibit state-of-the-art performance in this domain, highlighting the potential benefits of adapting these models to specific needs.
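The correct-and-regenerate protocol and the WSR metric can be illustrated with a toy prefix-based simulation. The stub model and every name below are assumptions for illustration, not the mBART setup itself.

```python
def common_prefix_len(hyp, ref):
    n = 0
    for h, r in zip(hyp, ref):
        if h != r:
            break
        n += 1
    return n

def interactive_translate(model, reference):
    """Simulate prefix-based interactive MT: after each user word
    correction (one 'stroke') the model regenerates the suffix.
    Returns the final hypothesis and the Word Stroke Ratio (WSR)."""
    strokes = 0
    hyp = model([])                      # unconstrained first pass
    while hyp != list(reference):
        k = common_prefix_len(hyp, reference)
        strokes += 1                     # user types the next correct word
        hyp = model(reference[:k + 1])   # regenerate given validated prefix
    return hyp, strokes / len(reference)

def stub_model(prefix, guess=("this", "is", "a", "test")):
    # Toy NMT model: keeps the validated prefix, completes it with a fixed guess.
    return list(prefix) + list(guess[len(prefix):])

hyp, wsr = interactive_translate(stub_model, ["this", "was", "a", "test"])
print(wsr)  # one corrected word out of four → 0.25
```

MAR would be counted analogously from mouse actions; it is omitted here to keep the sketch short.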

pdf
Machine translation of Korean statutes examined from the perspective of quality and productivity
Jieun Lee | Hyoeun Choi

Because machine translation (MT) still falls short of human parity, human intervention is needed to ensure quality translation. The existing literature indicates that machine translation post-editing (MTPE) generally enhances translation productivity, but the question of quality remains for domain-specific texts (e.g. Aranberri et al., 2014; Jia et al., 2022; Kim et al., 2019; Lee, 2021a,b). Although legal translation is considered one of the most complex specialist translation domains, because of the surge in demand for legal translation, MT has been utilized to some extent for documents of lesser importance (Roberts, 2022). Given that little research has examined the productivity and quality of MT and MTPE in Korean-English legal translation, we sought to examine the productivity and quality of MT and MTPE of Korean statutes, using DeepL, a neural machine translation engine that recently launched its Korean language service. This paper presents the preliminary findings from a research project that investigated DeepL MT quality and the quality and productivity of MTPE outputs and human translations by seven professional translators.

pdf
Fine-tuning MBART-50 with French and Farsi data to improve the translation of Farsi dislocations into English and French
Behnoosh Namdarzadeh | Sadaf Mohseni | Lichao Zhu | Guillaume Wisniewski | Nicolas Ballier

In this paper, we discuss the improvements brought by fine-tuning mBART50 for the translation of a specific Farsi dataset of dislocations. Given our BLEU scores, our evaluation is mostly qualitative: we assess the improvements of our fine-tuning in the translations into French of our Farsi test dataset. We describe the fine-tuning procedure and discuss the quality of the results in the translations from Farsi. We assess the sentences in the French translations that contain English tokens, and for the English translations, we examine the ability of the fine-tuned system to translate Farsi dislocations into English without replicating the dislocated item as a double subject. We scrutinized the Farsi training data used to train mBART50 (Tang et al., 2021). We fine-tuned mBART50 with samples from an in-house French-Farsi aligned translation of a short story. In spite of the scarcity of available resources, we found that fine-tuning with aligned French-Farsi data dramatically improved the grammatical well-formedness of the predictions for French, even if serious semantic issues remained. We replicated the experiment with the English translation of the same Farsi short story for Farsi-English fine-tuning and found that similar semantic inadequacies cropped up, and that some translations were worse than our mBART50 baseline. We showcased the fine-tuning of mBART50 with supplementary data and discussed the asymmetry of the situation: adding a little data in fine-tuning is sufficient to improve morpho-syntax for one language pair but seems to degrade translation into English.

pdf
KG-IQES: An Interpretable Quality Estimation System for Machine Translation Based on Knowledge Graph
Junhao Zhu | Min Zhang | Hao Yang | Song Peng | Zhanglin Wu | Yanfei Jiang | Xijun Qiu | Weiqiang Pan | Ming Zhu | Ma Miaomiao | Weidong Zhang

The widespread use of machine translation (MT) has driven the need for effective automatic quality estimation (AQE) methods. How to enhance the interpretability of MT output quality estimation is well worth exploring in the industry. From the perspective of the alignment of named entities (NEs) in the source and translated sentences, we construct a multilingual knowledge graph (KG) consisting of domain-specific NEs, and design a KG-based interpretable quality estimation (QE) system for machine translations (KG-IQES). KG-IQES effectively estimates the translation quality without relying on reference translations. Its effectiveness has been verified in our business scenarios.
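A minimal sketch in the spirit of the NE-alignment idea above: flag a translation when a source entity's known target-side rendering (from a knowledge graph) is missing from the output. The KG lookup and NE recognizer are stubbed with dictionaries; this is not the KG-IQES implementation.

```python
KG = {"可见光通信": "visible light communication"}  # hypothetical MKG entry

def entity_qe(source, translation, kg=KG):
    """Return the list of source NEs whose expected translation is
    absent from the MT output (empty list = no NE-level issue).
    Works reference-free: only the source and the KG are needed."""
    return [
        ne for ne, expected in kg.items()
        if ne in source and expected.lower() not in translation.lower()
    ]

issues = entity_qe("可见光通信的应用", "applications of VLC")
print(issues)
```

Each flagged entity doubles as an interpretable explanation of why the score is low, which is the point of a KG-based QE design.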

pdf
Enhancing Gender Representation in Neural Machine Translation: A Comparative Analysis of Annotating Strategies for English-Spanish and English-Polish Language Pairs
Celia Soler Uguet | Fred Bane | Mahmoud Aymo | João Pedro Fernandes Torres | Anna Zaretskaya | Tània Blanch Miró

Machine translation systems have been shown to demonstrate gender bias (Savoldi et al., 2021; Stafanovičs et al., 2020; Stanovsky et al., 2020) and to contribute to this bias with systematically unfair translations. In this presentation, we explore a method of enforcing gender in NMT. We generalize the method proposed by Vincent et al. (2022) to create training data that does not require a first-person speaker. Drawing from other works that use special tokens to pass additional information to NMT systems (e.g. Ailem et al., 2021), we annotate the training data with special tokens to mark the gender of a given noun in the text, which enables the NMT system to produce the correct gender during translation. These tokens are also used to mark the gender in a source sentence at inference time. However, in production scenarios, gender is often unknown at inference time, so we propose two methods of leveraging language models to obtain these labels. Our experiment is set up in a fine-tuning scenario, adapting an existing translation model with gender-annotated data. We focus on the English-Spanish and English-Polish language pairs. Without guidance, NMT systems often ignore signals that indicate the correct gender for translation, such as that for the noun “developer” in the following sentence: “The developer argued with the designer because she did not like the design.” To this end, we consider two methods of annotating the source English sentence for gender:
a) We use a coreference resolution model based on SpanBERT (Joshi et al., 2020) to connect any gender-indicating pronouns to their head nouns.
b) We use the GPT-3.5 model, prompted to identify the gender of each person in the sentence based on the context within the sentence.
For test data, we use a collection of sentences from Stanovsky et al. (2020), each including two professions and one pronoun that can refer to only one of them.
We use the above two methods to annotate the source sentences we want to translate, produce the translations with our fine-tuned model, and compare the accuracy of the gender translation in both cases. The correctness of the gender was evaluated by professional linguists. Overall, we observed a significant improvement in gender translation compared to the baseline (a 7% improvement for Spanish and a 50% improvement for Polish), with SpanBERT outperforming GPT on this task. The Polish MT model still struggles to produce the correct gender (even the translations produced with ground-truth gender markings are correct in only 56% of cases). We discuss the limitations of this method. Our research is intended as a reference for fellow MT practitioners, as it offers a comparative analysis of two practical implementations that show the potential to enhance the accuracy of gender in translation, thereby elevating the overall quality of translation and mitigating gender bias.
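The special-token annotation step described above can be sketched as follows. The token inventory (`<F>`, `</F>`, `<M>`, `</M>`) and the `annotate()` interface are assumptions, not the system's actual markup, and the coreference step (SpanBERT or GPT-3.5) is replaced by a precomputed label dictionary.

```python
GENDER_TOKENS = {"female": ("<F>", "</F>"), "male": ("<M>", "</M>")}

def annotate(sentence, noun_genders):
    """Wrap each labeled noun in gender tokens.

    noun_genders: {noun: "female"|"male"}, e.g. the output of a
    coreference model linking pronouns to their head nouns."""
    words = []
    for word in sentence.split():
        bare = word.strip(".,")
        if bare in noun_genders:
            open_t, close_t = GENDER_TOKENS[noun_genders[bare]]
            word = word.replace(bare, f"{open_t}{bare}{close_t}")
        words.append(word)
    return " ".join(words)

src = "The developer argued with the designer because she did not like the design."
print(annotate(src, {"developer": "female"}))
```

The annotated sentence is then fed to the fine-tuned NMT model, which was trained on data carrying the same tokens.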

pdf
Brand Consistency for Multilingual E-commerce Machine Translation
Bryan Zhang | Stephan Walter | Saurabh Chetan Birari | Ozlem Eren

In the realm of e-commerce, it is crucial to ensure consistent localization of brand terms in product information translations. With the ever-evolving e-commerce landscape, new brands and their localized versions are constantly emerging. However, these diverse brand forms and aliases present a significant challenge in machine translation (MT). This study investigates the brand consistency problem of MT in multilingual e-commerce and proposes practical and sustainable solutions to maintain brand consistency in various scenarios within the e-commerce industry. Through experimentation and analysis of an English-Arabic MT system, we demonstrate the effectiveness of our proposed solutions.

pdf
Developing automatic verbatim transcripts for international multilingual meetings: an end-to-end solution
Akshat Dewan | Michal Ziemski | Henri Meylan | Lorenzo Concina | Bruno Pouliquen

This paper presents an end-to-end solution for the creation of fully automated conference meeting transcripts and their machine translations into various languages. This tool has been developed at the World Intellectual Property Organization (WIPO) using in-house developed speech-to-text (S2T) and machine translation (MT) components. Beyond describing data collection and fine-tuning, resulting in a highly customized and robust system, this paper describes the architecture and evolution of the technical components as well as highlights the business impact and benefits from the user side. We also point out particular challenges in the evolution and adoption of the system and how the new approach created a new product and replaced existing established workflows in conference management documentation.

pdf
Optimizing Machine Translation through Prompt Engineering: An Investigation into ChatGPT’s Customizability
Masaru Yamada

This paper explores the influence of integrating the purpose of the translation and the target audience into prompts on the quality of translations produced by ChatGPT. Drawing on previous translation studies, industry practices, and ISO standards, the research underscores the significance of the pre-production phase in the translation process. The study reveals that the inclusion of suitable prompts in large-scale language models like ChatGPT can yield flexible translations, a feat yet to be realized by conventional Machine Translation (MT). The research scrutinizes the changes in translation quality when prompts are used to generate translations that meet specific conditions. The evaluation is conducted from a practicing translator’s viewpoint, both subjectively and qualitatively, supplemented by the use of OpenAI’s word embedding API for cosine similarity calculations. The findings suggest that the integration of the purpose and target audience into prompts can indeed modify the generated translations, generally enhancing the translation quality by industry standards. The study also demonstrates the practical application of the “good translation” concept, particularly in the context of marketing documents and culturally dependent idioms.
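The cosine-similarity step mentioned above is independent of how the embeddings are obtained. A minimal sketch of the computation itself, with toy vectors standing in for embeddings returned by an API such as OpenAI's:

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors:
    dot(u, v) / (|u| * |v|), in [-1, 1] for real-valued vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 3-dimensional vectors standing in for sentence embeddings of a
# reference translation and a prompt-conditioned ChatGPT translation.
ref_embedding = [0.20, 0.70, 0.10]
hyp_embedding = [0.25, 0.65, 0.15]

print(round(cosine_similarity(ref_embedding, hyp_embedding), 3))
```

In practice the vectors would be high-dimensional embeddings of the two translations; a higher score indicates the prompt-conditioned output stayed semantically close to the comparison text.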

pdf
Comparing Chinese-English MT Performance Involving ChatGPT and MT Providers and the Efficacy of AI mediated Post-Editing
Larry Cady | Benjamin Tsou | John Lee

The recent introduction of ChatGPT has caused much stir in the translation industry because of its impressive translation performance against leaders in the industry. We review some major issues based on BLEU comparisons of Chinese-to-English (C2E) and English-to-Chinese (E2C) machine translation (MT) performance by ChatGPT against a range of leading MT providers in mostly technical domains. Based on sample aligned sentences from a sizable bilingual Chinese-English patent corpus and other sources, we find that while ChatGPT performs better generally, it does not consistently perform better than others in all areas or cases. We also draw on novice translators as post-editors to explore a major component in MT post-editing: the optimization of terminology. Many new technical words, including MWEs (multi-word expressions), are problematic because they involve terminological developments which must balance proper encapsulation of technical innovation against conformity with past traditions. Drawing on the above-mentioned corpus, we have been developing an AI-mediated MT post-editing (MTPE) system through the optimization of precedent rendition distribution and semantic association to enhance the work of translators and MTPE practitioners.
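The BLEU comparisons underlying studies like this one can be sketched in simplified form. The function below computes clipped n-gram precision with a brevity penalty at the sentence level; real evaluations would use a standard tool such as sacrebleu, and this toy version omits smoothing:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def simple_bleu(hypothesis, reference, max_n=4):
    """Simplified sentence-level BLEU: geometric mean of clipped
    1..max_n-gram precisions times a brevity penalty (no smoothing)."""
    hyp, ref = hypothesis.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        overlap = sum((hyp_counts & ref_counts).values())  # clipped matches
        total = max(sum(hyp_counts.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0  # unsmoothed BLEU collapses if any precision is zero
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    brevity_penalty = min(1.0, math.exp(1 - len(ref) / len(hyp)))
    return brevity_penalty * geo_mean

print(simple_bleu("the cat sat on the mat", "the cat sat on the mat"))
```

Corpus-level BLEU, as used when ranking MT providers, aggregates the n-gram counts over all sentence pairs before taking precisions rather than averaging per-sentence scores.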

pdf
Challenges of Human vs Machine Translation of Emotion-Loaded Chinese Microblog Texts
Shenbin Qian | Constantin Orăsan | Félix do Carmo | Diptesh Kanojia

This paper attempts to identify challenges professional translators face when translating emotion-loaded texts as well as errors machine translation (MT) makes when translating this content. We invited ten Chinese-English translators to translate thirty posts of a Chinese microblog, and interviewed them about the challenges encountered during translation and the problems they believe MT might have. Further, we analysed more than five thousand automatic translations of microblog posts to observe problems in MT outputs. We establish that the most challenging problem for human translators is emotion-carrying words, which translators also consider a problem for MT. Analysis of MT outputs shows that this is also the most common source of MT errors. We also find that what is challenging for MT, such as non-standard writing, is not necessarily an issue for humans. Our work contributes to a better understanding of the challenges for the translation of microblog posts by humans and MT, caused by different forms of expression of emotion.