Proceedings of Machine Translation Summit VI: Papers

Virginia Teller, Beth Sundheim (Editors)

October 29 – November 1
San Diego, California

Machine Translation of Interactive Texts
Mary Flanagan

A Real-Time MT System for Translating Broadcast Captions
Eric Nyberg | Teruko Mitamura

This presentation demonstrates a new multi-engine machine translation system, which combines knowledge-based and example-based machine translation strategies for real-time translation of business news captions from English to German.

Managing Distributed MT Projects Today — A New Challenge
Jennifer A. Brundage | Susan McCormick | Chris Pyne

The current trend towards globalization means that even the most modern of industries must constantly re-evaluate its strategies and adapt to new technologies. As a long-time supporter of MT technology, SAP has shown that it can make productive use of competitive, commercial MT products along with other CAT products. In making MT work for them, however, SAP has also had to substantially adapt the products that they received from MT companies. The result, after many years, is a full range of peripheral tools and workflow scenarios that support the use of their MT programs.

MT Research and Development (R&D) in Europe
Roger Havenith

Exchange Interfaces for Translation Tools
Gregor Thurmair

The following paper presents an overview of current discussions of exchange interfaces in the area of multilingual processing. It first discusses the principles which are relevant for the definition of such interfaces; it then presents a state of the art and a proposal in the area of text interfaces, translation memory interfaces, and terminology exchange. The approach is bottom-up, i.e. it starts from existing interfaces and existing requirements, and intends to be of practical use. It reflects the discussions in current multilingual research projects of the EC, like OTELO and AVENTINUS.

R&D for Commercial MT
Laurie Gerber

MT research in the commercial environment tends to be conservative, and to introduce change gradually, both because of limited funds, and the need to quickly turn innovations into product features. However, there are a number of challenges and opportunities that could make commercial research a much more dynamic environment for advancement of the field as a whole.

MT from an Everyday User’s Point of View
Annelise Bech

This paper discusses the experiences of the specialised Danish translation company Lingtech in its use of MT for the translation of technical texts. The background and motivation for setting up Lingtech as an MT-based company are outlined. After a short general presentation of the PaTrans MT-system, the different tasks we have to perform in relation to our use of MT and the way this work is organized in order to achieve maximum cost-efficiency are described. This leads on to the discussion of problem areas for the everyday user in terms of ergonomics and tools for what may be called 'peripheral' tasks, e.g. pre- and post-editing texts, and dictionary maintenance. In the course of gaining experience in running an MT-based organization, we have identified crucial areas where even relatively simple tools can have quite an impact on the overall productivity and profitability of using MT. Given the state of the art within language technology, many useful tools can now be made for the MT-user; however, we argue that too little attention has been given to these aspects so far and that they may indeed be critical to the commercial success of machine translation.

Translating Scientific Texts using MT and MAT Systems: Practical Experience of a Professional Translator
Olga Bezhanova

The paper describes practical experience of a professional translator. The task consisted in translating 400 pages of Russian scientific materials (covering all fundamental sciences) into English within a month. The job was fulfilled using three computer-based systems: PARS, a Russian-English bidirectional machine translation system by Lingvistica '93 Co., Polyglossum, dictionary-support software by ETS Ltd., and the Random House electronic dictionary of the English language. The paper analyzes the pluses and minuses of translating scientific texts using computer programs, and gives numerous examples of translations. The main conclusion is that machine translation has no reasonable alternative when a large volume of scientific texts is to be translated professionally within a short period of time.

User-Friendly Machine Translation: Alternate Translations Based on Differing Beliefs
David Farwell | Stephen Helmreich

In this paper the authors present a notion of “user-friendly” translation and describe a method for achieving it within a pragmatics-based approach to machine translation. The approach relies on modeling the beliefs of the participants in the translation process: the source language speaker and addressee, the translator and the target language addressee. Translation choices may vary according to how beliefs are ascribed to the various participants and, in particular, “user-friendly” choices are based on the beliefs ascribed to the TL addressee.

Sharable Formats and Their Supporting Environments for Exchanging User Dictionaries among Different MT Systems as a Part of AAMT Activities
Shin-ichiro Kamei | Etsuo Itoh | Mikiko Fujii | Tokuyuki Hirai | Yukari Saitoh | Masahito Takahashi | Tsutomu Hiyama | Kazunori Muraki

We, machine translation providers, as members of the Asia-Pacific Association for Machine Translation (AAMT), are now establishing environments for sharing and exchanging user dictionaries among different machine translation systems. In order for users to utilize machine translation systems more effectively, we define common formats for user dictionaries, and establish electronic environments in which users can exchange their user dictionaries using these common formats. This task started in 1996, and the formats will be fixed in March of 1998.

JEIDA’s Bilingual Corpus and Other Corpora for NLP Research in Japan
Hitoshi Isahara

The committee on text processing technology of JEIDA (Japan Electronics Industry Development Association) has been developing its bilingual corpus for research on machine translation systems since the 1996 Japanese fiscal year. An overview of this bilingual corpus is presented in this paper. Other linguistic data recently developed in Japan, including the RWC text database and the simple sentence data by the CRL and IPA, are also described.

Multi-Lingual Spoken Dialog Translation System Using Transfer-Driven Machine Translation
Hideki Mima | Osamu Furuse | Yumi Wakita | Hitoshi Iida

This paper describes a Transfer-Driven Machine Translation (TDMT) system as a prototype for efficient multi-lingual spoken-dialog translation. Currently, the TDMT system deals with dialogues in the travel domain, such as travel scheduling, hotel reservation, and trouble-shooting, and covers almost all expressions presented in commercially-available travel conversation guides. In addition, to put a speech dialog translation system into practical use, it is necessary to develop a mechanism that can handle the speech recognition errors. In TDMT, robust translation can be achieved by using an example-based correct parts extraction (CPE) technique to translate the plausible parts from speech recognition results even if the results have several recognition errors. We have applied TDMT to three language pairs, i.e., Japanese-English, Japanese-Korean, Japanese-German. Simulations of dialog communication between different language speakers can be provided via a TCP/IP network. In our performance evaluation for the translation of TDMT utilizing 69-87 unseen dialogs, we achieved about 70% acceptability in the JE, KJ translations, almost 60% acceptability in the EJ and JG translations, and about 90% acceptability in the JK translations. In the case of handling erroneous sentences caused by speech recognition errors, although almost all translation results end up as unacceptable translation in conventional methods, 69% of the speech translation results are improved by the CPE technique.

MT R&D in Asia
Hozumi Tanaka

MT R&D in this region has shifted considerably following the many large-scale projects conducted over the past ten years. The Multi-lingual Machine Translation (MMT) project is one of the significant R&D projects that greatly increased the number of NLP-related researchers and research activities, as can be seen in the growing number of research institutes in recent years. We learned a lot from this collaborative research across languages, and we still hope that it will be a solid step toward future MT R&D in this region. Though MT systems are still far from the ultimate goal of perfect translation, they are already being used to support information retrieval from the Internet.

Corpus-Based Statistics-Oriented (CBSO) Machine Translation Researches in Taiwan
Jing-Shin Chang | Keh-Yih Su

A brief introduction to the MT research projects in Taiwan is given in this paper. Special attention is given to the increasingly popular corpus-based statistics-oriented (CBSO) approaches in MT research. In particular, the paper introduces the parameterized two-way training philosophy used in designing the second-generation BehaviorTran, which is the first and the largest operational system in this area.

An Example of MT Use by the U.S. Government
Joel Ross

PARS/U for Windows: The World’s First Commercial English-Ukrainian and Ukrainian-English Machine Translation System
Michael S. Blekhman | Alla Rakova | Andrei Kursin

The paper describes the PARS/U Ukrainian-English bidirectional MT system by Lingvistica '93 Co. PARS/U translates MS Word and HTML files as well as screen Helps. It features an easy-to-master dictionary updating program, which permits the user to customize the system by means of running subject-area oriented texts through the MT engine. PARS/U is marketed in Ukraine and North America.

From METAL to T1: Systems and Components for Machine Translation Applications
Ulrike Schwall | Gregor Thurmair

This paper describes the progress which has been made to make MT systems usable in professional environments. After many years of significant investment, it was decided that the time was ripe for the METAL machine translation system to be better positioned in the market place. Two lines of action were followed: introducing the system onto the PC market, using the GMS-T1 as a concrete example; and reusing system components in customized solutions, using the AVENTINUS project, a multilingual information processing application, as an example. Both lines of action have far-reaching consequences for system development. But they also create new opportunities to improve the system's capabilities and flexibility.

MT R&D in Canada
Elliott Macklovitch

SYSTRAN MT Dictionary Development
Laurie Gerber | Jin Yang

SYSTRAN has demonstrated success in the MT field with its long history spanning nearly 30 years. As a general-purpose fully automatic MT system, SYSTRAN employs a transfer approach. Among its several components, large, carefully encoded, high-quality dictionaries are critical to SYSTRAN's translation capability. A total of over 2.4 million words and expressions are now encoded in the dictionaries for twelve source language systems (30 language pairs - one per year!). SYSTRAN's dictionaries, along with its parsers, transfer modules, and generators, have been tested on huge amounts of text, and contain large terminology databases covering various domains and detailed linguistic rules. Using these resources, SYSTRAN MT systems have successfully served practical translation needs for nearly 30 years, and built a reputation in the MT world for their large, mature dictionaries. This paper describes various aspects of SYSTRAN MT dictionary development as an important part of the development and refinement of SYSTRAN MT systems. There are 4 major sections: 1) Role and Importance of Dictionaries in the SYSTRAN Paradigm describes the importance of coverage and depth in the dictionaries; 2) Dictionary Structure discusses the specifics of dictionary structure and types of information represented; 3) Dictionary Creation and Update describes the strategy and mechanics of the dictionary development; 4) Past, Present and Future Development provides some perspective on where SYSTRAN has come from and where it is going.

MT as a Commercial Service: Three Case Studies
Terence Lewis

This paper presents three case studies showing the considerably different uses customers make of our Dutch-English MT service.

Java and Its Role in Natural Language Processing and Machine Translation
Tim Read | Elena Bárcena | Pamela Faber

The Java programming language started as the language Oak when the World Wide Web was still being developed at CERN. It has gained popularity since its launch as a programming language capable of being used to develop applications which can run across the Internet (as well as local stand-alone programs). As with many technologies associated with the World Wide Web, there is a lot of 'hype', confusion, and misinformation. Consequently, while many researchers in the area of Natural Language Processing and Machine Translation will have heard of Java, may be considering using it, or even have got as far as their first 'Hello World' applet, they are probably not fully aware of what the implications of using this language are, and what possible role it could have in the development of computational linguistic applications, either intended to run locally on a wide range of computing platforms, or remotely across the Internet. This paper sets out to address this issue by presenting Java in a clear, concise fashion and considering how it may be used in computational linguistic applications. A requirements analysis for a generic Natural Language Processing and Machine Translation tool is undertaken to consider how Java could be used, and subsequently two example systems developed in Java (which can be accessed on the Internet) are introduced. Finally, pointers to Java resources are presented so that researchers interested in using this language can both install it and learn how to program it.

End-to-End Evaluation in VERBMOBIL I
Rita Nübel

VERBMOBIL is a speech-to-speech translation system for spoken dialogues between two speakers. The application scenario is appointment scheduling for business meetings. Both dialogue participants have at least a passive knowledge of English, which serves as the intermediate language. The transfer directions are German to English and Japanese to English. A special feature of VERBMOBIL is that translations are produced on demand when the dialogue participants are unable to express themselves in English and therefore prefer to use their mother tongue. In this paper we present the criteria and the evaluation procedure for evaluating the translation quality of the VERBMOBIL prototype. The evaluated data have been produced by three concurrent processing methods that are integrated in the VERBMOBIL prototype. These processing methods differ with respect to processing depth, processing speed and translation quality ([2], p. 2). The paper is structured as follows: we start by giving a short description of the VERBMOBIL architecture, focusing on the concurrent linguistic analyses and transfer processes which lead to three alternative translation outputs for each turn. In section two we outline the evaluation procedure and criteria. The third section discusses the evaluation results, and the conclusion of the paper gives an outlook on future applications of automated evaluation procedures for machine translation (MT) based on an MT architecture where several concurrent translation approaches are integrated.

Using MT in a Corporate Setting
Lou Cremers

Using MT in a Corporate Setting
Christine Kamprath