Spicing up the information soup: machine translation and the internet
Steve McLaughlin | Ulrike Schwall
Proceedings of the Third Conference of the Association for Machine Translation in the Americas: Technical Papers
The Internet is rapidly changing the face of business and dramatically transforming people’s working and private lives. These developments present both a challenge and an opportunity to many technologies, one of the most important being Machine Translation. The Internet will soon be the most important medium for offering and finding information, and one of the principal means of communication for both companies and private users. There are many players on the Internet scene, each with different needs. Some players require help in presenting their information to an international audience, others require help in finding the information they seek and, because the Internet is increasingly multilingual, help in understanding that which they find. This paper attempts to identify the players and their needs, and outlines the products and services with which Machine Translation can help them to fully participate in the Internet revolution.
This paper describes the progress made in making MT systems usable in professional environments. After many years of significant investment, it was decided that the time was ripe for the METAL machine translation system to be better positioned in the marketplace. Two lines of action were followed: introducing the system onto the PC market, using the GMS-T1 as a concrete example; and reusing system components in customized solutions, using the AVENTINUS project, a multilingual information processing application, as an example. Both lines of action have far-reaching consequences for system development, but they also create new opportunities to improve the system's capabilities and flexibility.
This paper deals with multiword lexemes (MWLs), focussing on two types of verbal MWLs: verbal idioms and support verb constructions. We discuss the characteristic properties of MWLs, namely non-standard compositionality, restricted substitutability of components, and restricted morpho-syntactic flexibility, and we show how these properties may cause serious problems during the analysis, generation, and transfer steps of machine translation systems. In order to cope with these problems, MT lexicons need to provide detailed descriptions of MWL properties. We list the types of information which we consider the necessary minimum for a successful processing of MWLs, and report on some feasibility studies aimed at the automatic extraction of German verbal multiword lexemes from text corpora and machine-readable dictionaries.
IBM is engaged in advanced research and development projects on various aspects of machine translation between several language pairs. The activities reported on here are all parts of a rather large-scale, international effort, following Michael McCord’s LMT approach. The paper focuses on seven selected topics: recent enhancements made in the Slot Grammar formalism and the specific analysis components; specification of a semantic type hierarchy and its use for verb sense disambiguation; incorporation of statistical techniques in the translation process; anaphora resolution; linkage of target morphology modules; methods for the construction of large MT lexicons; and interactive disambiguation.