Laurie Gerber
We describe a case study that presents a framework for examining whether Machine Translation (MT) output enables translation professionals to translate faster while at the same time producing better quality translations than without MT output. We seek to find decision factors that enable a translation professional, known as a Paralinguist, to determine whether MT output is of sufficient quality to serve as a “seed translation” for post-editors. The decision factors, unlike MT developers’ automatic metrics, must function without a reference translation. We also examine the correlation of MT developers’ automatic metrics with error annotators’ assessments of post-edited translations.
In this paper, we describe the methods used to develop an exchangeable translation memory bank of sentence-aligned Mandarin Chinese - English sentences. This effort is part of a larger effort, initiated by the National Virtual Translation Center (NVTC), to foster collaboration and sharing of translation memory banks across the Intelligence Community and the Department of Defense. In this paper, we describe our corpus creation process - a largely automated process - highlighting the human interventions that are still deemed necessary. We conclude with a brief discussion of how this work will affect plans for NVTC's new translation management workflow and future research to increase the performance of the automated components of the corpus creation process.
In this short paper, I explore ways in which the MT community might formulate goals that will expand on known successes, build on existing strengths, and identify long term research goals.
Translation systems tend to have more trouble with long sentences than with short ones for a variety of reasons. When the source and target languages differ rather markedly, as do Japanese and English, this problem is reflected in lower quality output. To improve readability, we experimented with automatically splitting long sentences into shorter ones. This paper outlines the problem, describes the sentence splitting procedure and rules, and provides an evaluation of the results.
MT research in the commercial environment tends to be conservative, and to introduce change gradually, both because of limited funds, and the need to quickly turn innovations into product features. However, there are a number of challenges and opportunities that could make commercial research a much more dynamic environment for advancement of the field as a whole.
SYSTRAN has demonstrated success in the MT field with its long history spanning nearly 30 years. As a general-purpose fully automatic MT system, SYSTRAN employs a transfer approach. Among its several components, large, carefully encoded, high-quality dictionaries are critical to SYSTRAN's translation capability. A total of over 2.4 million words and expressions are now encoded in the dictionaries for twelve source language systems (30 language pairs - one per year!). SYSTRAN's dictionaries, along with its parsers, transfer modules, and generators, have been tested on huge amounts of text, and contain large terminology databases covering various domains and detailed linguistic rules. Using these resources, SYSTRAN MT systems have successfully served practical translation needs for nearly 30 years, and built a reputation in the MT world for their large, mature dictionaries. This paper describes various aspects of SYSTRAN MT dictionary development as an important part of the development and refinement of SYSTRAN MT systems. There are 4 major sections: 1) Role and Importance of Dictionaries in the SYSTRAN Paradigm describes the importance of coverage and depth in the dictionaries; 2) Dictionary Structure discusses the specifics of dictionary structure and types of information represented; 3) Dictionary Creation and Update describes the strategy and mechanics of the dictionary development; 4) Past, Present and Future Development provides some perspective on where SYSTRAN has come from and where it is going.