Mike Dillinger
This paper describes the facilities of Converser for Healthcare 4.0, a highly interactive speech translation system which enables users to verify and correct speech recognition and machine translation. Corrections are presently useful for real-time reliability, and in the future should prove applicable to offline machine learning. We provide examples of interactive tools in action, emphasizing semantically controlled back-translation and lexical disambiguation, and explain for the first time the techniques employed in the tools’ creation, focusing upon compilation of a database of semantic cues and its connection to third-party MT engines. Planned extensions of our techniques to statistical MT are also discussed.
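As an illustration of the verification idea, the sketch below shows one simple way a round-trip check could flag utterances for user review. It is a minimal assumption-laden example, not Converser's actual mechanism: the toy lexicon, the translate() placeholder, and the similarity threshold are all invented for demonstration, standing in for whatever third-party MT engine and semantic-cue database the real system uses.

```python
# Minimal sketch of back-translation verification (illustrative only; the toy
# lexicon and function names are assumptions, not Converser's actual engine).
from difflib import SequenceMatcher

# Stand-in for a third-party MT engine: a tiny word-for-word lexicon.
TOY_LEXICON = {
    ("en", "es"): {"the": "el", "doctor": "médico", "is": "está", "here": "aquí"},
    ("es", "en"): {"el": "the", "médico": "doctor", "está": "is", "aquí": "here"},
}

def translate(text, src, tgt):
    """Placeholder translation: look each word up in the toy lexicon."""
    table = TOY_LEXICON[(src, tgt)]
    return " ".join(table.get(w, w) for w in text.lower().split())

def verify_by_back_translation(source, src="en", tgt="es", threshold=0.8):
    """Translate, back-translate, and flag the utterance for user review if
    the round trip drifts too far from the original wording."""
    forward = translate(source, src, tgt)
    round_trip = translate(forward, tgt, src)
    similarity = SequenceMatcher(None, source.lower(), round_trip.lower()).ratio()
    return forward, round_trip, similarity < threshold

print(verify_by_back_translation("the doctor is here"))
# -> ('el médico está aquí', 'the doctor is here', False): round trip matches,
#    so no correction is requested from the user.
```

In the paper's setting the back-translation is shown to the user in their own language so they can confirm or correct the meaning; the string-similarity check here is only a crude automatic proxy for that judgment.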
This paper reports on three business opportunities encountered by Spoken Translation, Inc., a developer of software systems for automatic spoken translation: (1) a healthcare organization needing improved communications between limited-English patients and their caregivers; (2) a networking and communications firm aiming to add UN-style simultaneous interpreting to their telepresence facilities; and (3) the retail arm of a device manufacturer hoping to enable more effective in-store consulting for customers with imperfect command of an outlet's native language. None of these openings has yet led to substantial business, but one remains in negotiation. We describe how the business introductions came to us; the proposed use cases; demonstrations, presentations, tests, etc.; and issues/challenges. We also comment on early consumer-oriented products for spoken language translation. The aim is to provide a snapshot of one company's business possibilities and challenges at the dawn of the era of automatic interpreting.
This tutorial is for people who are beginning to evaluate how well machine translation will fit their needs or who are curious to know more about how it is used. We assume no previous knowledge of machine translation. We focus on background knowledge that will help you both get more out of the rest of AMTA2010 and make better decisions about how to invest in machine translation. Past participants have ranged from tech writers and freelance translators who want to keep up to date to VPs and CEOs who are evaluating technology strategies for their organizations. The main topics for discussion are common FAQs about MT (Can machines really translate? Can we fire our translators now?) and limitations (Why is the output so bad? What is MT good for?), workflow (Why buy MT if it’s free on the internet? What other kinds of translation automation are there? How do we use it?), return on investment (How much does MT cost? How can we convince our bosses to buy MT?), and steps to deployment (Which MT system should we buy? What do we do next?).
Content localisation via machine translation (MT) is a sine qua non, especially for international online business. While most applications utilise rule-based solutions due to the lack of suitable in-domain parallel corpora for statistical MT (SMT) training, in this paper we investigate the possibility of applying SMT where only monolingual content is available, albeit in huge amounts. We describe a case study in which ALS analyses a very large amount of monolingual online trading data from eBay with a view to reducing this corpus to the most representative sample while ensuring the widest possible coverage of the total data set. Furthermore, minimal yet optimal sets of sentences/words/terms are selected for generation of initial translation units for future SMT system-building.
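The reduction step described here is essentially a coverage problem: keep the smallest sample that still exhibits most of the vocabulary (or terms) of the full data set. The sketch below is one generic, greedy way to do that; it is an assumption for illustration, not ALS's actual selection procedure, and it uses word types where the real study may have used terms or n-grams.

```python
# Illustrative greedy, set-cover-style corpus reduction (not ALS's method):
# repeatedly take the sentence contributing the most unseen word types until
# the sample covers a target share of the full corpus vocabulary.
def select_representative_sample(corpus, target_coverage=0.95):
    corpus = list(corpus)
    full_vocab = {tok for sent in corpus for tok in sent.lower().split()}
    covered, sample = set(), []
    remaining = list(corpus)
    while remaining and len(covered) < target_coverage * len(full_vocab):
        best = max(remaining, key=lambda s: len(set(s.lower().split()) - covered))
        gain = set(best.lower().split()) - covered
        if not gain:          # nothing new left to add
            break
        sample.append(best)
        covered |= gain
        remaining.remove(best)
    return sample

listings = ["brand new phone case", "vintage camera lens", "new camera bag", "phone charger cable"]
print(select_representative_sample(listings, target_coverage=0.9))
```

A greedy pass like this is quadratic in the number of sentences, so a production pipeline over eBay-scale data would need batching or feature-hashing, but the selection principle is the same.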
Spoken Translation, Inc. (STI) of Berkeley, CA has developed a commercial system for interactive speech-to-speech machine translation designed for both high accuracy and broad linguistic and topical coverage. Planned use is in situations requiring both of these features, for example in helping Spanish-speaking patients to communicate with English-speaking doctors, nurses, and other health-care staff.
This paper reports on the development of a collocation extraction system that is designed within a commercial machine translation system in order to take advantage of the robust syntactic analysis that the system offers and to use this analysis to refine collocation extraction. Embedding the extraction system also addresses the need to provide information about the source language collocations in a system-specific form to support automatic generation of a collocation rulebase for analysis and translation.
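To make the embedding idea concrete, the fragment below sketches one common way to score candidate collocations once a parser has supplied head-dependent pairs. The input format, function names, and the PMI scoring are assumptions for illustration; they are not the commercial system's actual extraction algorithm, which the paper describes only at the level of using the MT system's syntactic analysis.

```python
# Illustrative collocation scoring over parser output (assumed interface):
# rank (head, relation, dependent) pairs by pointwise mutual information.
import math
from collections import Counter

def score_collocations(triples, min_count=3):
    """`triples` is assumed to be (head_lemma, relation, dependent_lemma)
    tuples produced by the MT system's syntactic analysis."""
    triples = list(triples)
    pair_counts = Counter(triples)
    head_counts = Counter(h for h, _, _ in triples)
    dep_counts = Counter(d for _, _, d in triples)
    total = len(triples)
    scores = {}
    for (h, r, d), n in pair_counts.items():
        if n < min_count:
            continue
        pmi = math.log2((n / total) /
                        ((head_counts[h] / total) * (dep_counts[d] / total)))
        scores[(h, r, d)] = pmi
    # Highest-scoring pairs are the strongest collocation candidates.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Working from parsed head-dependent pairs rather than raw word windows is what the syntactic embedding buys: "take a decision" and "decision ... was taken" collapse to the same (take, obj, decision) candidate.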
An important part of the development of any machine translation system is the creation of lexical resources. We describe an analysis of the dictionary development workflow and supporting tools currently in use and under development at Logos. This workflow identifies the component processes of: setting goals, locating and acquiring lexical resources, transforming the resources to a common format, classifying and routing entries for special processing, importing entries, and verifying their adequacy in translation. Our approach has been to emphasize the tools necessary to support increased automation and use of resources available in electronic formats, in the context of a systematic workflow design.
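The workflow is essentially a staged import pipeline, and a very small sketch of that shape is given below. The field names, the classification rule, and the CSV input are all hypothetical stand-ins introduced for illustration; they are not Logos's internal formats or tools.

```python
# Minimal sketch of a dictionary-import pipeline of the kind the workflow
# describes: normalise to a common format, classify/route, and reject entries
# that fail basic verification. All names and fields here are assumptions.
import csv

COMMON_FIELDS = ("source_term", "target_term", "pos", "domain")

def to_common_format(row):
    """Map a raw resource row onto the common entry format."""
    return {field: (row.get(field) or "").strip() for field in COMMON_FIELDS}

def classify(entry):
    """Route entries that need special processing (here: multiword terms)."""
    return "multiword" if " " in entry["source_term"] else "simple"

def import_resource(path):
    queues = {"simple": [], "multiword": [], "rejected": []}
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            entry = to_common_format(row)
            if not entry["source_term"] or not entry["target_term"]:
                queues["rejected"].append(entry)   # fails basic verification
            else:
                queues[classify(entry)].append(entry)
    return queues
```

The final stage the abstract mentions, verifying adequacy in translation, would sit downstream of a pipeline like this, feeding imported entries through the MT system on test sentences.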