Greg Hanneman
2025
Findings of the WMT25 Shared Task on Automated Translation Evaluation Systems: Linguistic Diversity is Challenging and References Still Help
Alon Lavie | Greg Hanneman | Sweta Agrawal | Diptesh Kanojia | Chi-Kiu Lo | Vilém Zouhar | Frederic Blain | Chrysoula Zerva | Eleftherios Avramidis | Sourabh Deoghare | Archchana Sindhujan | Jiayi Wang | David Ifeoluwa Adelani | Brian Thompson | Tom Kocmi | Markus Freitag | Daniel Deutsch
Proceedings of the Tenth Conference on Machine Translation
The WMT25 Shared Task on Automated Translation Evaluation Systems evaluates metrics and quality estimation systems that assess the quality of language translation systems. This task unifies and consolidates the separate WMT shared tasks on Machine Translation Evaluation Metrics and Quality Estimation from previous years. Our primary goal is to encourage the development and assessment of new state-of-the-art translation quality evaluation systems. The shared task this year consisted of three subtasks: (1) segment-level quality score prediction, (2) span-level translation error annotation, and (3) quality-informed segment-level error correction. The evaluation data for the shared task were provided by the General MT shared task and were complemented by “challenge sets” from both the organizers and participants. Task 1 results indicate the strong performance of large LLMs at the system level, while reference-based baseline metrics outperform LLMs at the segment level. Task 2 results indicate that accurate error detection and balancing precision and recall are persistent challenges. Task 3 results show that minimal editing is challenging even when informed by quality indicators. Robustness across the broad diversity of languages remains a major challenge across all three subtasks.
2024
Impacts of Misspelled Queries on Translation and Product Search
Greg Hanneman | Natawut Monaikul | Taichi Nakatani
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Machine translation is used in e-commerce to translate second-language queries into the primary language of the store, to be matched by the search system against the product catalog. However, many queries contain spelling mistakes. We first present an analysis of the spelling-robustness of a population of MT systems, quantifying how spelling variations affect MT output, the list of returned products, and ultimately user behavior. We then present two sets of practical experiments illustrating how spelling-robustness may be specifically improved. For MT, reducing the number of BPE operations significantly improves spelling-robustness in six language pairs. In end-to-end e-commerce, the inclusion of a dedicated spelling correction model, and the augmentation of that model’s training data with language-relevant phenomena, each improve robustness and consistency of search results.
2020
How Should Markup Tags Be Translated?
Greg Hanneman | Georgiana Dinu
Proceedings of the Fifth Conference on Machine Translation
The ability of machine translation (MT) models to correctly place markup is crucial to generating high-quality translations of formatted input. This paper compares two commonly used methods of representing markup tags and tests the ability of MT models to learn tag placement via training data augmentation. We study the interactions of tag representation, data augmentation size, tag complexity, and language pair to show the drawbacks and benefits of each method. We construct and release new test sets containing tagged data for three language pairs of varying difficulty.
2018
Leveraging Data Resources for Cross-Linguistic Information Retrieval Using Statistical Machine Translation
Steve Sloto | Ann Clifton | Greg Hanneman | Patrick Porter | Donna Gates | Almut Hildebrand | Anish Kumar
Proceedings of the 13th Conference of the Association for Machine Translation in the Americas (Volume 2: User Track)
2014
The CMU Machine Translation Systems at WMT 2014
Austin Matthews | Waleed Ammar | Archna Bhatia | Weston Feely | Greg Hanneman | Eva Schlinger | Swabha Swayamdipta | Yulia Tsvetkov | Alon Lavie | Chris Dyer
Proceedings of the Ninth Workshop on Statistical Machine Translation
2013
Improving Syntax-Augmented Machine Translation by Coarsening the Label Set
Greg Hanneman | Alon Lavie
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
The CMU Machine Translation Systems at WMT 2013: Syntax, Synthetic Translation Options, and Pseudo-References
Waleed Ammar | Victor Chahuneau | Michael Denkowski | Greg Hanneman | Wang Ling | Austin Matthews | Kenton Murray | Nicola Segall | Alon Lavie | Chris Dyer
Proceedings of the Eighth Workshop on Statistical Machine Translation
2012
The CMU-Avenue French-English Translation System
Michael Denkowski | Greg Hanneman | Alon Lavie
Proceedings of the Seventh Workshop on Statistical Machine Translation
2011
Automatic Category Label Coarsening for Syntax-Based Machine Translation
Greg Hanneman | Alon Lavie
Proceedings of Fifth Workshop on Syntax, Semantics and Structure in Statistical Translation
A General-Purpose Rule Extractor for SCFG-Based Machine Translation
Greg Hanneman | Michelle Burroughs | Alon Lavie
Proceedings of Fifth Workshop on Syntax, Semantics and Structure in Statistical Translation
CMU Syntax-Based Machine Translation at WMT 2011
Greg Hanneman | Alon Lavie
Proceedings of the Sixth Workshop on Statistical Machine Translation
2010
Improved Features and Grammar Selection for Syntax-Based MT
Greg Hanneman | Jonathan Clark | Alon Lavie
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR
2009
Machine Translation System Combination with Flexible Word Ordering
Kenneth Heafield | Greg Hanneman | Alon Lavie
Proceedings of the Fourth Workshop on Statistical Machine Translation
An Improved Statistical Transfer System for French-English Machine Translation
Greg Hanneman | Vamshi Ambati | Jonathan H. Clark | Alok Parlikar | Alon Lavie
Proceedings of the Fourth Workshop on Statistical Machine Translation
Decoding with Syntactic and Non-Syntactic Phrases in a Syntax-Based Machine Translation System
Greg Hanneman | Alon Lavie
Proceedings of the Third Workshop on Syntax and Structure in Statistical Translation (SSST-3) at NAACL HLT 2009
2008
Co-authors
- Alon Lavie 13
- Vamshi Ambati 2
- Waleed Ammar 2
- Jonathan H. Clark 2
- Michael Denkowski 2
- Chris Dyer 2
- Austin Matthews 2
- Alok Parlikar 2
- David Ifeoluwa Adelani 1
- Abhaya Agarwal 1
- Sweta Agrawal 1
- Eleftherios Avramidis 1
- Archna Bhatia 1
- Frédéric Blain 1
- Michelle Burroughs 1
- Victor Chahuneau 1
- Ann Clifton 1
- Sourabh Deoghare 1
- Daniel Deutsch 1
- Georgiana Dinu 1
- Weston Feely 1
- Markus Freitag 1
- Donna Gates 1
- Kenneth Heafield 1
- Almut Silja Hildebrand 1
- Edmund Huber 1
- Diptesh Kanojia 1
- Tom Kocmi 1
- Anish Kumar 1
- Wang Ling 1
- Chi-kiu Lo 1
- Natawut Monaikul 1
- Kenton Murray 1
- Taichi Nakatani 1
- Erik Peterson 1
- Patrick Porter 1
- Eva Schlinger 1
- Nicola Segall 1
- Archchana Sindhujan 1
- Steve Sloto 1
- Swabha Swayamdipta 1
- Brian Thompson 1
- Yulia Tsvetkov 1
- Jiayi Wang 1
- Chrysoula Zerva 1
- Vilém Zouhar 1