Machine Translation Summit (2011)
Volumes
- Proceedings of Machine Translation Summit XIII: Plenaries (6 papers)
- Proceedings of Machine Translation Summit XIII: Papers (68 papers)
- Proceedings of Machine Translation Summit XIII: System Presentations (5 papers)
- Proceedings of Machine Translation Summit XIII: Tutorial Abstracts (4 papers)
- Proceedings of the 4th Workshop on Patent Translation (11 papers)
Proceedings of Machine Translation Summit XIII: Papers
Methods for Smoothing the Optimizer Instability in SMT
Mauro Cettolo | Nicola Bertoldi | Marcello Federico
Training Machine Translation with a Second-Order Taylor Approximation of Weighted Translation Instances
Aaron Phillips | Ralf Brown
Maximum Rank Correlation Training for Statistical Machine Translation
Daqi Zheng | Yifan He | Yang Liu | Qun Liu
POS Tagging of English Particles for Machine Translation
Jianjun Ma | Degen Huang | Haixia Liu | Wenfeng Sheng
Multi-stage Chinese Dependency Parsing Based on Dependency Direction
Wenjing Lang | Qiaoli Zhou | Guiping Zhang | Dongfeng Cai
Phonetic Representation-Based Speech Translation
Jie Jiang | Zeeshan Ahmed | Julie Carson-Berndsen | Peter Cahill | Andy Way
Unsupervised Vocabulary Selection for Domain-Independent Simultaneous Lecture Translation
Paul Maergner | Ian Lane | Alex Waibel
Context-aware Language Modeling for Conversational Speech Translation
Avneesh Saluja | Ian Lane | Ying Zhang
Incremental Training and Intentional Over-fitting of Word Alignment
Qin Gao | Will Lewis | Chris Quirk | Mei-Yuh Hwang
Alignment Inference and Bayesian Adaptation for Machine Translation
Kevin Duh | Katsuhito Sudoh | Tomoharu Iwata | Hajime Tsukada
Multi-Strategy Approaches to Active Learning for Statistical Machine Translation
Vamshi Ambati | Stephan Vogel | Jaime Carbonell
Document-level Consistency Verification in Machine Translation
Tong Xiao | Jingbo Zhu | Shujie Yao | Hao Zhang
Function Word Generation in Statistical Machine Translation Systems
Lei Cui | Dongdong Zhang | Mu Li | Ming Zhou
Multimodal Building of Monolingual Dictionaries for Machine Translation by Non-Expert Users
Miquel Esplà-Gomis | Víctor M. Sánchez-Cartagena | Juan Antonio Pérez-Ortiz
Automatic Post-Editing based on SMT and its selective application by Sentence-Level Automatic Quality Evaluation
Hirokazu Suzuki
Qualitative Analysis of Post-Editing for High Quality Machine Translation
Frédéric Blain | Jean Senellart | Holger Schwenk | Mirko Plitt | Johann Roturier
Using machine translation in computer-aided translation to suggest the target-side words to change
Miquel Esplà-Gomis | Felipe Sánchez-Martínez | Mikel L. Forcada
Improving Phrase Extraction via MBR Phrase Scoring and Pruning
Nan Duan | Mu Li | Ming Zhou | Lei Cui
Phrase Segmentation Model using Collocation and Translational Entropy
Hyoung-Gyu Lee | Joo-Young Lee | Min-Jeong Kim | Hae-Chang Rim | Joong-Hwi Shin | Young-Sook Hwang
Singular or Plural? Exploiting Parallel Corpora for Chinese Number Prediction
Elizabeth Baran | Nianwen Xue
Handling Multiword Expressions in Phrase-Based Statistical Machine Translation
Santanu Pal | Tanmoy Chakraborty | Sivaji Bandyopadhyay
A Unified and Discriminative Soft Syntactic Constraint Model for Hierarchical Phrase-based Translation
Lemao Liu | Tiejun Zhao | Chao Wang | Hailong Cao
Simple but Effective Approaches to Improving Tree-to-tree Model
Feifei Zhai | Jiajun Zhang | Yu Zhou | Chengqing Zong
Unpacking and Transforming Feature Functions: New Ways to Smooth Phrase Tables
Boxing Chen | Roland Kuhn | George Foster | Howard Johnson
Identification and Translation of Significant Patterns for Cross-Domain SMT Applications
Han-Bin Chen | Hen-Hsen Huang | Jengwei Tjiu | Ching-Ting Tan | Hsin-Hsi Chen
Domain Adaptation in Statistical Machine Translation of User-Forum Data using Component Level Mixture Modelling
Pratyush Banerjee | Sudip Kumar Naskar | Johann Roturier | Andy Way | Josef van Genabith
Extracting Pre-ordering Rules from Chunk-based Dependency Trees for Japanese-to-English Translation
Xianchao Wu | Katsuhito Sudoh | Kevin Duh | Hajime Tsukada | Masaaki Nagata
Post-ordering in Statistical Machine Translation
Katsuhito Sudoh | Xianchao Wu | Kevin Duh | Hajime Tsukada | Masaaki Nagata
Searching Translation Memories for Paraphrases
Masao Utiyama | Graham Neubig | Takashi Onishi | Eiichiro Sumita
Are numbers good enough for you? - A linguistically meaningful MT evaluation method
Takako Aikawa | Spencer Rarrick
A Comparison of Unsupervised Bilingual Term Extraction Methods Using Phrase-Tables
Masamichi Ideue | Kazuhide Yamamoto | Masao Utiyama | Eiichiro Sumita
Improving Low-Resource Statistical Machine Translation with a Novel Semantic Word Clustering Algorithm
Jeff Ma | Spyros Matsoukas | Richard Schwartz
Multi-granularity Word Alignment and Decoding for Agglutinative Language Translation
Zhiyang Wang | Yajuan Lü | Qun Liu
Lexical-based Reordering Model for Hierarchical Phrase-based Machine Translation
Zhongguang Zheng | Yao Meng | Hao Yu
Parallel Corpus Refinement as an Outlier Detection Algorithm
Kaveh Taghipour | Shahram Khadivi | Jia Xu
Handheld Machine Translation System Based on Constraint Synchronous Grammar
Fai Wong | Francisco Oliveira | Sam Chao | Chi-Wai Tang
A Comparison Study of Parsers for Patent Machine Translation
Isao Goto | Masao Utiyama | Takashi Onishi | Eiichiro Sumita
Rich Linguistic Features for Translation Memory-Inspired Consistent Translation
Yifan He | Yanjun Ma | Andy Way | Josef van Genabith
Japanese-Chinese Phrase Alignment Using Common Chinese Characters Information
Chenhui Chu | Toshiaki Nakazawa | Sadao Kurohashi
The Cultivation of a Chinese-English-Japanese Trilingual Parallel Corpus from Comparable Patents
Bin Lu | Ka Po Chow | Benjamin K. Tsou
Example-Based Machine Translation for Low-Resource Language Using Chunk-String Templates
Md. Anwarus Salam Khan | Setsuo Yamada | Tetsuro Nishino
Improve SMT with Source-Side “Topic-Document” Distributions
Zhengxian Gong | Guodong Zhou | Liangyou Li
Predicting Machine Translation Adequacy
Lucia Specia | Najeh Hajlaoui | Catalina Hallett | Wilker Aziz
Getting Expert Quality from the Crowd for Machine Translation Evaluation
Luisa Bentivogli | Marcello Federico | Giovanni Moretti | Michael Paul
A Framework for Diagnostic Evaluation of MT Based on Linguistic Checkpoints
Sudip Kumar Naskar | Antonio Toral | Federico Gaspari | Andy Way
Comparative Evaluation of Term Informativeness Measures in Machine Translation Evaluation Metrics
Billy Wong | Chunyu Kit
System Combination for Machine Translation Based on Text-to-Text Generation
Wei-Yun Ma | Kathleen Mckeown
Hybrid Machine Translation Guided by a Rule–Based System
Cristina España-Bonet | Gorka Labaka | Arantza Díaz de Ilarraza | Lluís Màrquez
Integrating shallow-transfer rules into phrase-based statistical machine translation
Víctor M. Sánchez-Cartagena | Felipe Sánchez-Martínez | Juan Antonio Pérez-Ortiz
Study on the Impact Factors of the Translators’ Post-editing Efficiency in a Collaborative Translation Environment
Na Ye | Guiping Zhang
Proceedings of Machine Translation Summit XIII: System Presentations
ENGtube: an Integrated Subtitle Environment for ESL
Chi-Ho Li | Shujie Liu | Chenguang Wang | Ming Zhou
Broadcast news speech-to-text translation experiments
Sylvain Raybaud | David Langlois | Kamel Smaïli
Proceedings of Machine Translation Summit XIII: Tutorial Abstracts
Over the past twenty years, we have attacked the historical methodological barriers between statistical machine translation and traditional models of syntax, semantics, and structure. In this tutorial, we will survey some of the central issues and techniques from each of these aspects, with an emphasis on 'deeply theoretically integrated' models, rather than hybrid approaches such as superficial statistical aggregation or system combination of outputs produced by traditional symbolic components. On syntactic SMT, we will explore the trade-offs for SMT between learnability and representational expressiveness. After establishing a foundation in the theory and practice of stochastic transduction grammars, we will examine very recent approaches to automatic unsupervised induction of various classes of transduction grammars. We will show why stochastic linear transduction grammars (LTGs and LITGs) and their preterminalized variants (PLITGs) are proving to be particularly intriguing models for bootstrapping the induction of full-fledged stochastic inversion transduction grammars (ITGs). On semantic SMT, we will explore the trade-offs for SMT involved in applying various lexical semantics models. We will first examine word sense disambiguation, and discuss why traditional WSD models that are not deeply integrated within the SMT model tend, surprisingly, to fail. In contrast, we will show how a deeply embedded phrase sense disambiguation (PSD) approach succeeds where traditional WSD does not. We will then turn to semantic role labeling, and discuss the challenges of early approaches to applying SRL models to SMT. Finally, on semantic MT evaluation, we will explore some very new human and semi-automatic metrics based on semantic frame agreement.
We show that by keeping the metrics deeply grounded within the theoretical framework of semantic frames, the new HMEANT and MEANT metrics can significantly outperform even the state-of-the-art expensive HTER and TER metrics, while at the same time maintaining the desirable characteristics of simplicity, inexpensiveness, and representational transparency.
From the Confidence Estimation of Machine Translation to the Integration of MT and Translation Memory
Yanjun Ma | Yifan He | Josef van Genabith
In this tutorial, we cover techniques that facilitate the integration of Machine Translation (MT) and Translation Memory (TM), which can help the adoption of MT technology in the localisation industry. The tutorial covers five parts: i) a brief introduction to MT and TM systems, ii) MT confidence estimation measures tailored for the TM environment, iii) segment-level MT and TM integration, iv) sub-segment level MT and TM integration, and v) human evaluation of MT and TM integration. We will first briefly describe and compare how translations are generated in MT and TM systems, and suggest possible avenues to combine these two systems. We will also cover current quality/cost estimation measures applied in MT and TM systems, such as the fuzzy-match score in the TM, and the evaluation/confidence metrics used to judge MT outputs. We then move on to introduce the recent developments in the field of MT confidence estimation tailored towards predicting post-editing effort. We will especially focus on the confidence metrics proposed by Specia et al., which are shown to have high correlation with human preference, as well as post-editing time. For segment-level MT and TM integration, we present translation recommendation and translation re-ranking models, where the integration happens at the 1-best or the N-best level, respectively. Given an input to be translated, MT-TM recommendation compares the output from the MT and the TM systems, and presents the better one to the post-editor. MT-TM re-ranking, on the other hand, combines k-best lists from both systems, and generates a new list according to estimated post-editing effort. We observe high precision of these models in automatic and human evaluations, indicating that they can be integrated into TM environments without the risk of deteriorating the quality of the post-editing candidate. For sub-segment level MT and TM integration, we try to reuse high quality TM chunks to improve the quality of MT systems.
We can also predict whether phrase pairs derived from fuzzy matches should be used to constrain the translation of an input segment. Using a series of linguistically motivated features, our constraints lead both to more consistent translation output, and to improved translation quality, as measured by automatic evaluation scores. Finally, we present several methodologies that can be used to track post-editing effort, perform human evaluation of MT-TM integration, or help translators to access MT outputs in a TM environment.
This half-day tutorial provides a broad overview of how to evaluate translations that are produced by machine translation systems. The range of issues covered includes a broad survey of both human evaluation measures and commonly-used automated metrics, and a review of how these are used for various types of evaluation tasks, such as assessing the translation quality of MT-translated sentences, comparing the performance of alternative MT systems, or measuring the productivity gains of incorporating MT into translation workflows.
Localization is a term mainly used in the software industry to designate the adaptation of products to meet local market needs. At the center of this process lies the translation of the most visible part of the product – the user interface – and the product documentation. Not surprisingly, the localization industry has therefore long been an extensive consumer of translation technology and a key contributor to its progress. Software products are typically released in recurrent cycles, with large amounts of content remaining unchanged or undergoing only minor modifications from one release to the next. In addition, software development cycles are short, forcing translation to start while the product is still undergoing changes, so that localized products can reach global markets in a timely fashion. These two aspects result in a heavy dependency on the efficient handling of translation updates. It is only natural that the software industry turned to software-based productivity tools to automate the recycling of translations (through translation memories) and to support the management of the translation workflow (through translation management systems). Machine translation is a relatively recent addition to the localization technology mix, and not yet as widely adopted as one would expect. Its initial use in the software industry was for more accessory content which is otherwise often left untranslated, e.g. product support articles and antivirus alerts with their short lifecycle. The expectation had however always been that MT could one day be deployed on the bulk of user interface and product documentation, due to the expected process efficiencies and cost savings. While MT is generally still not considered “good” enough to be used raw on this type of content, it has now become an integral part of translation productivity environments, thereby transforming translators into post-editors. 
The tutorial will provide an overview of current localization practices and challenges, with a special focus on the role of translation memory and translation management technologies. As a use case of the integration of MT in such an environment, we will then present the approach taken by Autodesk with its large set of Moses engines trained on custom data. Finally, we will explore typical scenarios in which machine translation is employed in the localization industry, using practical examples and data gathered in different productivity and usability tests.
Proceedings of the 4th Workshop on Patent Translation
Feedback Selecting of Manually Acquired Rules Using Automatic Evaluation
Xianhua Li | Yajuan Lü | Yao Meng | Qun Liu | Hao Yu
Investigation for Translation Disambiguation of Verbs in Patent Sentences using Word Grouping
Shoichi Yokoyama | Yuichi Takano
Patent translation within the MOLTO project
Cristina España-Bonet | Ramona Enache | Adam Slaski | Aarne Ranta | Lluís Màrquez | Meritxell Gonzàlez
Building a Statistical Machine Translation System for Translating Patent Documents
Jeff Ma | Spyros Matsoukas