Andreas Zollmann


2011

pdf bib
A Word-Class Approach to Labeling PSCFG Rules for Machine Translation
Andreas Zollmann | Stephan Vogel
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2010

pdf
New Parameterizations and Features for PSCFG-Based Machine Translation
Andreas Zollmann | Stephan Vogel
Proceedings of the 4th Workshop on Syntax and Structure in Statistical Translation

2009

pdf
Preference Grammars: Softening Syntactic Constraints to Improve Statistical Machine Translation
Ashish Venugopal | Andreas Zollmann | Noah A. Smith | Stephan Vogel
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

2008

pdf
Wider Pipelines: N-Best Alignments and Parses in MT Training
Ashish Venugopal | Andreas Zollmann | Noah A. Smith | Stephan Vogel
Proceedings of the 8th Conference of the Association for Machine Translation in the Americas: Research Papers

State-of-the-art statistical machine translation systems use hypotheses from several maximum a posteriori inference steps, including word alignments and parse trees, to identify translational structure and estimate the parameters of translation models. While this approach leads to a modular pipeline of independently developed components, errors made in these “single-best” hypotheses can propagate to downstream estimation steps that treat these inputs as clean, trustworthy training data. In this work we integrate N-best alignments and parses by using a probability distribution over these alternatives to generate posterior fractional counts for use in downstream estimation. Using these fractional counts in a DOP-inspired syntax-based translation system, we show significant improvements in translation quality over a single-best trained baseline.

pdf bib
The CMU syntax-augmented machine translation system: SAMT on Hadoop with n-best alignments.
Andreas Zollmann | Ashish Venugopal | Stephan Vogel
Proceedings of the 5th International Workshop on Spoken Language Translation: Evaluation Campaign

We present the CMU Syntax Augmented Machine Translation System that was used in the IWSLT-08 evaluation campaign. We participated in the Full-BTEC data track for Chinese-English translation, focusing on transcript translation. For this year’s evaluation, we ported the Syntax Augmented MT toolkit [1] to the Hadoop MapReduce [2] parallel processing architecture, allowing us to efficiently run experiments evaluating a novel “wider pipelines” approach to integrate evidence from N -best alignments into our translation models. We describe each step of the MapReduce pipeline as it is implemented in the open-source SAMT toolkit, and show improvements in translation quality by using N-best alignments in both hierarchical and syntax augmented translation systems.

pdf
A Systematic Comparison of Phrase-Based, Hierarchical and Syntax-Augmented Statistical MT
Andreas Zollmann | Ashish Venugopal | Franz Och | Jay Ponte
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

2007

pdf
The Syntax Augmented MT (SAMT) System at the Shared Task for the 2007 ACL Workshop on Statistical Machine Translation
Andreas Zollmann | Ashish Venugopal | Matthias Paulik | Stephan Vogel
Proceedings of the Second Workshop on Statistical Machine Translation

pdf
An Efficient Two-Pass Approach to Synchronous-CFG Driven Statistical MT
Ashish Venugopal | Andreas Zollmann | Stephan Vogel
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference

pdf
The CMU-UKA statistical machine translation systems for IWSLT 2007
Ian Lane | Andreas Zollmann | Thuy Linh Nguyen | Nguyen Bach | Ashish Venugopal | Stephan Vogel | Kay Rottmann | Ying Zhang | Alex Waibel
Proceedings of the Fourth International Workshop on Spoken Language Translation

This paper describes the CMU-UKA statistical machine translation systems submitted to the IWSLT 2007 evaluation campaign. Systems were submitted for three language-pairs: Japanese→English, Chinese→English and Arabic→English. All systems were based on a common phrase-based SMT (statistical machine translation) framework but for each language-pair a specific research problem was tackled. For Japanese→English we focused on two problems: first, punctuation recovery, and second, how to incorporate topic-knowledge into the translation framework. Our Chinese→English submission focused on syntax-augmented SMT and for the Arabic→English task we focused on incorporating morphological-decomposition into the SMT framework. This research strategy enabled us to evaluate a wide variety of approaches which proved effective for the language pairs they were evaluated on.

2006

pdf
Syntax Augmented Machine Translation via Chart Parsing
Andreas Zollmann | Ashish Venugopal
Proceedings on the Workshop on Statistical Machine Translation

pdf
The CMU-UKA syntax augmented machine translation system for IWSLT-06
Andreas Zollmann | Ashish Venugopal | Stephan Vogel | Alex Waibel
Proceedings of the Third International Workshop on Spoken Language Translation: Evaluation Campaign

pdf
Bridging the Inflection Morphology Gap for Arabic Statistical Machine Translation
Andreas Zollmann | Ashish Venugopal | Stephan Vogel
Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers

2005

pdf
Training and Evaluating Error Minimization Decision Rules for Statistical Machine Translation
Ashish Venugopal | Andreas Zollmann | Alex Waibel
Proceedings of the ACL Workshop on Building and Using Parallel Texts