Mara Nunziatini
This paper discusses the capabilities and benefits of OPAL Enable, an advanced AI suite designed to modernize localization processes. The suite comprises Machine Translation, AI Post-Editing, and AI Quality Estimation tools, integrated into well-known translation management systems. The paper provides an in-depth analysis of these features, detailing the order in which they are applied and the time and cost savings they offer. It emphasizes the potential to customize OPAL Enable to meet client-specific requirements, increase scalability, and expedite workflows.
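To make the described workflow concrete, here is a minimal sketch of the three-stage order the abstract mentions: machine translation, then AI post-editing, then AI quality estimation. All names (Segment, run_pipeline, and the mt/ape/qe objects) are illustrative assumptions, not the actual OPAL Enable API.

```python
# Hypothetical sketch of the three-stage workflow: MT -> APE -> QE.
# Function and class names are assumptions for illustration only.

from dataclasses import dataclass


@dataclass
class Segment:
    source: str
    translation: str = ""
    post_edited: str = ""
    qe_score: float = 0.0  # 0.0 (unusable) to 1.0 (publishable)


def run_pipeline(segment: Segment, mt, ape, qe) -> Segment:
    """Apply the stages in the order the abstract describes."""
    segment.translation = mt.translate(segment.source)
    segment.post_edited = ape.post_edit(segment.source, segment.translation)
    segment.qe_score = qe.estimate(segment.source, segment.post_edited)
    return segment
```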
This paper investigates the effectiveness of combining machine translation (MT) systems and large language models (LLMs) to produce gender-inclusive translations from English to Spanish. The study uses a multi-step approach in which a translation is first generated by an MT engine and then reviewed by an LLM. The results suggest that while LLMs, particularly GPT-4, succeed in generating gender-inclusive post-edited translations and show potential for enhancing fluency, they often introduce unnecessary changes and inconsistencies. The findings underscore the continued necessity of human review in the translation process, highlighting the current limitations of AI systems in handling nuanced tasks such as gender-inclusive translation. The study also shows that while the combined approach can improve translation fluency, the effectiveness and reliability of the post-edited translations vary with the language of the prompts used.
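A minimal sketch of the two-step approach the abstract describes, under assumptions: a placeholder MT engine produces a draft, and GPT-4 (here via the OpenAI Python client) is prompted to revise it for gender-inclusive language. The prompt wording and the mt_engine interface are illustrative, not the paper's exact setup; note the abstract's caveat that results can vary with the language of the prompt.

```python
# Step 1: MT draft from a hypothetical engine. Step 2: LLM review pass.
# The review prompt below is an illustrative assumption, not the paper's prompt.

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

REVIEW_PROMPT = (
    "Review the following English-to-Spanish translation and rewrite it so "
    "that it uses gender-inclusive language. Change nothing else.\n\n"
    "English: {src}\nSpanish: {mt}\n\nRevised Spanish:"
)


def gender_inclusive_translate(src: str, mt_engine) -> str:
    draft = mt_engine.translate(src)  # step 1: MT draft (hypothetical engine)
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user",
                   "content": REVIEW_PROMPT.format(src=src, mt=draft)}],
    )
    return response.choices[0].message.content  # step 2: LLM-reviewed output
```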
Segment-level Quality Estimation (QE) is an increasingly sought-after task in the Machine Translation (MT) industry. In recent years, it has evolved impressively, not only thanks to supervised models that use source and hypothesis information, but also through the use of MT probabilities. This work presents a different approach to QE in which only the source segment and the Neural MT (NMT) training data are needed, making it possible to approximate translation quality before inference. Our work is based on the idea that NMT quality at the segment level depends on the degree of similarity between the source segment to be translated and the engine’s training data. The proposed features, which measure this aspect of the data, achieve competitive correlations with MT metrics and human judgment, and prove advantageous for the post-editing (PE) prioritization task with domain-adapted engines.
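As an illustration of this idea, the sketch below approximates "similarity to the training data" with TF-IDF cosine similarity over character n-grams. The abstract does not specify the paper's actual features, so this is a stand-in under stated assumptions, not the authors' method.

```python
# Approximate pre-inference QE: score a source segment by how similar it is
# to the NMT engine's training sources. TF-IDF over character n-grams is an
# assumed stand-in for the paper's features.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def similarity_features(source_segment: str, training_sources: list[str]) -> dict:
    vectorizer = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))
    train_matrix = vectorizer.fit_transform(training_sources)
    seg_vector = vectorizer.transform([source_segment])
    sims = cosine_similarity(seg_vector, train_matrix)[0]
    # Segments closer to the training data are expected to translate better,
    # so low scores can be used to prioritize post-editing effort.
    return {"max_sim": float(sims.max()), "mean_sim": float(sims.mean())}
```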
The session will provide an overview of some of the new Machine Translation metrics available on the market, analyze whether and how these new metrics correlate at the segment level with the results of Adequacy and Fluency human assessments, and examine how they compare against TER scores and Levenshtein Distance (two of our currently preferred metrics), as well as against one another. The information in this session will help attendees better understand the strengths and weaknesses of these metrics and make informed decisions when forecasting MT production.
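For readers who want to reproduce this kind of comparison, here is a small sketch of a segment-level correlation check: per-segment metric scores are correlated with per-segment human ratings using Pearson's r. The score lists are hypothetical placeholders, not data from the session.

```python
# Correlate a candidate MT metric with human judgments, one score per segment.

from scipy.stats import pearsonr


def segment_level_correlation(metric_scores: list[float],
                              human_scores: list[float]) -> float:
    r, _p_value = pearsonr(metric_scores, human_scores)
    return r


# Hypothetical example: a metric that tracks Adequacy ratings closely
# yields r close to 1.
metric = [0.91, 0.42, 0.77, 0.30, 0.85]   # placeholder metric scores
adequacy = [4.5, 2.0, 4.0, 1.5, 4.2]      # placeholder human ratings (1-5)
print(f"Pearson r = {segment_level_correlation(metric, adequacy):.2f}")
```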
While definitions of full and light post-editing have been around for a while, and error typologies such as DQF and MQM have gained prominence since the beginning of the last decade, customers long tended to be inflexible about their final quality requirements, irrespective of text type, purpose, target audience, and so on. We are now finally seeing some change in this space, with renewed interest in different machine translation (MT) and post-editing (PE) service levels. While existing definitions of light and full post-editing are useful as general guidelines, they typically remain too abstract and inflexible for both translation buyers and linguists. Moreover, they are inconsistent and overlap across the literature and across Language Service Providers (LSPs). In this paper, we comment on existing industry standards and share our experience with several challenges, as well as ways to steer customer conversations and provide clear instructions to post-editors.