Philipp Koehn
2025
Learn and Unlearn: Addressing Misinformation in Multilingual LLMs
TaiMing Lu | Philipp Koehn
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
This paper investigates the propagation of information in multilingual large language models (LLMs) and evaluates the efficacy of various unlearning methods. We demonstrate that fake information, regardless of the language it is in, once introduced into these models through training data, can spread across different languages, compromising the integrity and reliability of the generated content. Our findings reveal that standard unlearning techniques, which typically focus on English data, are insufficient to mitigate the spread of harmful content in multilingual contexts and could inadvertently reinforce harmful content across languages. We show that harmful responses can be effectively eliminated for all languages only by addressing them in both English and the original language of the harmful data. This underscores the critical need for comprehensive unlearning strategies that consider the multilingual nature of modern LLMs to enhance their safety and reliability across languages.
Speech Vecalign: an Embedding-based Method for Aligning Parallel Speech Documents
Chutong Meng | Philipp Koehn
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
We present Speech Vecalign, a parallel speech document alignment method that monotonically aligns speech segment embeddings and does not depend on text transcriptions. Compared to the baseline method Global Mining, a variant of speech mining, Speech Vecalign produces longer speech-to-speech alignments. It also demonstrates greater robustness than Local Mining, another speech mining variant, as it produces less noise. We applied Speech Vecalign to 3,000 hours of unlabeled parallel English-German (En-De) speech documents from VoxPopuli, yielding about 1,000 hours of high-quality alignments. We then trained En-De speech-to-speech translation models on the aligned data. Speech Vecalign improves the En-to-De and De-to-En performance over Global Mining by 0.37 and 0.18 ASR-BLEU, respectively. Moreover, our models match or outperform SpeechMatrix model performance, despite using 8 times fewer raw speech documents.
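The core idea, monotonic alignment over speech segment embeddings, can be illustrated with a small dynamic-programming sketch. This is a simplified 1-to-1 variant for illustration only; the actual method follows Vecalign's coarse-to-fine approach and also scores many-to-many alignments.

import numpy as np

def cosine_sim(a, b):
    # Cosine similarity between two embedding vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def monotonic_align(src_embs, tgt_embs, skip_cost=0.3):
    """Align two embedding sequences monotonically (simplified 1-1 DP).
    Returns a list of (src_index, tgt_index) pairs."""
    n, m = len(src_embs), len(tgt_embs)
    score = np.full((n + 1, m + 1), -np.inf)
    back = {}
    score[0, 0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if score[i, j] == -np.inf:
                continue
            if i < n and j < m:  # match segment i with segment j
                s = score[i, j] + cosine_sim(src_embs[i], tgt_embs[j])
                if s > score[i + 1, j + 1]:
                    score[i + 1, j + 1], back[(i + 1, j + 1)] = s, (i, j, "match")
            for di, dj in ((1, 0), (0, 1)):  # leave a segment unaligned
                ii, jj = i + di, j + dj
                if ii <= n and jj <= m and score[i, j] - skip_cost > score[ii, jj]:
                    score[ii, jj], back[(ii, jj)] = score[i, j] - skip_cost, (i, j, "skip")
    pairs, cell = [], (n, m)
    while cell != (0, 0):       # trace back the best monotonic path
        i, j, op = back[cell]
        if op == "match":
            pairs.append((i, j))
        cell = (i, j)
    return pairs[::-1]

# Toy usage with random "speech segment embeddings".
rng = np.random.default_rng(0)
src = rng.normal(size=(5, 16))
print(monotonic_align(src, src))  # identical sequences align diagonally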
Seeing is Believing: Emotion-Aware Audio-Visual Language Modeling for Expressive Speech Generation
Weiting Tan | Jiachen Lian | Hirofumi Inaguma | Paden Tomasello | Philipp Koehn | Xutai Ma
Findings of the Association for Computational Linguistics: EMNLP 2025
We present an Audio-Visual Language Model (AVLM) for expressive speech generation by integrating full-face visual cues into a pre-trained expressive speech model. We explore multiple visual encoders and multimodal fusion strategies during pre-training to identify the most effective integration approach. Subsequent fine-tuning on emotion recognition and expressive dialogue tasks yields substantial gains over speech-only baselines (e.g., +5 F1 in emotion recognition). AVLM highlights the value of expressive visual information in guiding speech generation and offers a foundation for end-to-end multimodal conversational systems.
HiMATE: A Hierarchical Multi-Agent Framework for Machine Translation Evaluation
Shijie Zhang | Renhao Li | Songsheng Wang | Philipp Koehn | Min Yang | Derek F. Wong
Findings of the Association for Computational Linguistics: EMNLP 2025
The advancement of Large Language Models (LLMs) enables flexible and interpretable automatic evaluations. In the field of machine translation evaluation, utilizing LLMs with translation error annotations based on Multidimensional Quality Metrics (MQM) yields more human-aligned judgments. However, current LLM-based evaluation methods still face challenges in accurately identifying error spans and assessing their severity. In this paper, we propose HiMATE, a Hierarchical Multi-Agent Framework for Machine Translation Evaluation. We argue that existing approaches inadequately exploit the fine-grained structural and semantic information within the MQM hierarchy. To address this, we develop a hierarchical multi-agent system grounded in the MQM error typology, enabling granular evaluation of subtype errors. Two key strategies are incorporated to further mitigate systemic hallucinations within the framework: the utilization of the model’s self-reflective capability and the facilitation of agent discussion involving asymmetric information. Empirically, HiMATE outperforms competitive baselines across different datasets in conducting human-aligned evaluations. Further analyses underscore its significant advantage in error span detection and severity assessment, achieving an average F1-score improvement of 89% over the best-performing baseline. We make our code and data publicly available at https://github.com/nlp2ct-shijie/HiMATE.
Streaming Sequence Transduction through Dynamic Compression
Weiting Tan | Yunmo Chen | Tongfei Chen | Guanghui Qin | Haoran Xu | Chenyu Zhang | Benjamin Van Durme | Philipp Koehn
Proceedings of the 22nd International Conference on Spoken Language Translation (IWSLT 2025)
We introduce STAR (Stream Transduction with Anchor Representations), a novel Transformer-based model designed for efficient sequence-to-sequence transduction over streams. STAR dynamically segments input streams to create compressed anchor representations, achieving nearly lossless (12x) compression in Automatic Speech Recognition (ASR) and outperforming existing methods. Moreover, STAR demonstrates superior segmentation and latency-quality trade-offs in simultaneous Speech Translation, optimizing latency, memory footprint, and quality.
Proceedings of the Tenth Conference on Machine Translation
Barry Haddow | Tom Kocmi | Philipp Koehn | Christof Monz
Proceedings of the Tenth Conference on Machine Translation
Findings of the WMT25 General Machine Translation Shared Task: Time to Stop Evaluating on Easy Test Sets
Tom Kocmi | Ekaterina Artemova | Eleftherios Avramidis | Rachel Bawden | Ondřej Bojar | Konstantin Dranch | Anton Dvorkovich | Sergey Dukanov | Mark Fishel | Markus Freitag | Thamme Gowda | Roman Grundkiewicz | Barry Haddow | Marzena Karpinska | Philipp Koehn | Howard Lakougna | Jessica Lundin | Christof Monz | Kenton Murray | Masaaki Nagata | Stefano Perrella | Lorenzo Proietti | Martin Popel | Maja Popović | Parker Riley | Mariya Shmatova | Steinthór Steingrímsson | Lisa Yankovskaya | Vilém Zouhar
Proceedings of the Tenth Conference on Machine Translation
This paper presents the results of the General Machine Translation Task organized as part of the 2025 Conference on Machine Translation (WMT). Participants were invited to build systems for any of 30 language pairs. For half of these pairs, we conducted a human evaluation on test sets spanning four to five different domains. We evaluated 60 systems in total: 36 submitted by participants and 24 for which we collected translations from large language models (LLMs) and popular online translation providers. This year, we focused on creating challenging test sets by developing a difficulty sampling technique and using more complex source data. We evaluated system outputs with professional annotators using the Error Span Annotation (ESA) protocol, except for two language pairs, for which we used Multidimensional Quality Metrics (MQM) instead. We continued the trend of increasingly moving towards document-level translation, providing the source texts as whole documents containing multiple paragraphs.
Findings of the WMT25 Multilingual Instruction Shared Task: Persistent Hurdles in Reasoning, Generation, and Evaluation
Tom Kocmi | Sweta Agrawal | Ekaterina Artemova | Eleftherios Avramidis | Eleftheria Briakou | Pinzhen Chen | Marzieh Fadaee | Markus Freitag | Roman Grundkiewicz | Yupeng Hou | Philipp Koehn | Julia Kreutzer | Saab Mansour | Stefano Perrella | Lorenzo Proietti | Parker Riley | Eduardo Sánchez | Patricia Schmidtova | Mariya Shmatova | Vilém Zouhar
Proceedings of the Tenth Conference on Machine Translation
The WMT25 Multilingual Instruction Shared Task (MIST) introduces a benchmark to evaluate large language models (LLMs) across 30 languages. The benchmark covers five types of problems: machine translation, linguistic reasoning, open-ended generation, cross-lingual summarization, and LLM-as-a-judge. We provide automatic evaluation and collect human annotations, which highlight the limitations of automatic evaluation and allow further research into metric meta-evaluation. We run a diverse set of open- and closed-weight LLMs on our benchmark, providing a broad assessment of the multilingual capabilities of current LLMs. Results highlight substantial variation across sub-tasks and languages, revealing persistent challenges in reasoning, cross-lingual generation, and evaluation reliability. This work establishes a standardized framework for measuring future progress in multilingual LLM development.
Findings of the WMT 2025 Shared Task of the Open Language Data Initiative
David Dale | Laurie Burchell | Jean Maillard | Idris Abdulmumin | Antonios Anastasopoulos | Isaac Caswell | Philipp Koehn
Proceedings of the Tenth Conference on Machine Translation
We present the results of the WMT 2025 shared task of the Open Language Data Initiative. Participants were invited to contribute to the massively multilingual open datasets (FLORES+, MT Seed, WMT24++) or create new such resources. We accepted 8 submissions, including 7 extensions or revisions of the existing datasets and one submission with a new parallel training dataset, SMOL.
2024
Can Synthetic Speech Improve End-to-End Conversational Speech Translation?
Bismarck Bamfo Odoom | Nathaniel Robinson | Elijah Rippeth | Luis Tavarez-Arce | Kenton Murray | Matthew Wiesner | Paul McNamee | Philipp Koehn | Kevin Duh
Proceedings of the 16th Conference of the Association for Machine Translation in the Americas (Volume 1: Research Track)
Conversational speech translation is an important technology that fosters communication among people of different language backgrounds. Three-way parallel data in the form of source speech, source transcript, and target translation is usually required to train end-to-end systems. However, such datasets are not readily available and are expensive to create as this involves multiple annotation stages. In this paper, we investigate the use of synthetic data from generative models, namely machine translation and text-to-speech synthesis, for training conversational speech translation systems. We show that adding synthetic data to the training recipe increasingly improves end-to-end training performance, especially when limited real data is available. However, when no real data is available, no amount of synthetic data helps.
Narrowing the Gap between Zero- and Few-shot Machine Translation by Matching Styles
Weiting Tan | Haoran Xu | Lingfeng Shen | Shuyue Stella Li | Kenton Murray | Philipp Koehn | Benjamin Van Durme | Yunmo Chen
Findings of the Association for Computational Linguistics: NAACL 2024
Large language models trained primarily in a monolingual setting have demonstrated their ability to generalize to machine translation using zero- and few-shot examples with in-context learning. However, even though zero-shot translations are relatively good, there remains a discernible gap between their performance and that of the few-shot setting. In this paper, we investigate the factors contributing to this gap and find that it can largely be closed (by about 70%) by matching the writing styles of the target corpus. Additionally, we explore potential approaches to enhance zero-shot baselines without the need for parallel demonstration examples, providing valuable insights into how these methods contribute to improving translation metrics.
The Language Barrier: Dissecting Safety Challenges of LLMs in Multilingual Contexts
Lingfeng Shen | Weiting Tan | Sihao Chen | Yunmo Chen | Jingyu Zhang | Haoran Xu | Boyuan Zheng | Philipp Koehn | Daniel Khashabi
Findings of the Association for Computational Linguistics: ACL 2024
As the influence of large language models (LLMs) spans across global communities, their safety challenges in multilingual settings become paramount for alignment research. This paper examines the variations in safety challenges faced by LLMs across different languages and discusses approaches to alleviating such concerns. By comparing how state-of-the-art LLMs respond to the same set of malicious prompts written in higher- vs. lower-resource languages, we observe that (1) LLMs tend to generate unsafe responses much more often when a malicious prompt is written in a lower-resource language, and (2) LLMs tend to generate more irrelevant responses to malicious prompts in lower-resource languages. To understand where the discrepancy can be attributed, we study the effect of instruction tuning with reinforcement learning from human feedback (RLHF) or supervised finetuning (SFT) on the HH-RLHF dataset. Surprisingly, while training with high-resource languages improves model alignment, training in lower-resource languages yields minimal improvement. This suggests that the bottleneck of cross-lingual alignment is rooted in the pretraining stage. Our findings highlight the challenges in cross-lingual LLM safety, and we hope they inform future research in this direction.
Recovering document annotations for sentence-level bitext
Rachel Wicks | Matt Post | Philipp Koehn
Findings of the Association for Computational Linguistics: ACL 2024
In machine translation, historical models were incapable of handling longer contexts, so the lack of document-level datasets was less noticeable. Now, despite the emergence of long-sequence methods, we remain within a sentence-level paradigm and without data to adequately approach context-aware machine translation. Most large-scale datasets have been processed through a pipeline that discards document-level metadata. In this work, we reconstruct document-level information for three large datasets (ParaCrawl, News Commentary, and Europarl) in German, French, Spanish, Italian, Polish, and Portuguese (paired with English). We then introduce a document-level filtering technique as an alternative to traditional bitext filtering. We present this filtering with analysis to show that this method prefers context-consistent translations over those that may have been machine translated at the sentence level. Lastly, we train models on these longer contexts and demonstrate improvement in document-level translation without degradation of sentence-level translation. We release our dataset, ParaDocs, and resulting models as a resource to the community.
Pointer-Generator Networks for Low-Resource Machine Translation: Don’t Copy That!
Niyati Bafna | Philipp Koehn | David Yarowsky
Proceedings of the Fifth Workshop on Insights from Negative Results in NLP
While Transformer-based neural machine translation (NMT) is very effective in high-resource settings, many languages lack the necessary large parallel corpora to benefit from it. In the context of low-resource (LR) MT between two closely-related languages, a natural intuition is to seek benefits from structural “shortcuts”, such as copying subwords from the source to the target, given that such language pairs often share a considerable number of identical words, cognates, and borrowings. We test Pointer-Generator Networks for this purpose for six language pairs over a variety of resource ranges, and find weak improvements for most settings. However, analysis reveals that the model does not yield greater improvements for closely-related vs. more distant language pairs, or for lower resource ranges, and that the models do not exhibit the expected usage of the mechanism for shared subwords. Our discussion of the reasons for this behaviour highlights several general challenges for LR NMT, such as modern tokenization strategies, noisy real-world conditions, and linguistic complexities. We call for better scrutiny of linguistically motivated improvements to NMT given the blackbox nature of Transformer models, as well as for a focus on the above problems in the field.
Speech Data from Radio Broadcasts for Low Resource Languages
Bismarck Bamfo Odoom | Paola Leibny Garcia | Prangthip Hansanti | Loïc Barrault | Christophe Ropers | Matthew Wiesner | Kenton Murray | Alex Mourachko | Philipp Koehn
Proceedings of the 21st International Conference on Spoken Language Translation (IWSLT 2024)
We created a collection of speech data for 48 low-resource languages. The corpus is extracted from radio broadcasts and processed with novel speech detection and language identification models based on a manually vetted subset of the audio for 10 languages. The data is made publicly available.
Where are you from? Geolocating Speech and Applications to Language Identification
Patrick Foley | Matthew Wiesner | Bismarck Bamfo Odoom | Leibny Paola Garcia | Kenton Murray | Philipp Koehn
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
We train models to answer the question, Where are you from? and show how such models can be repurposed for language identification (LID). To our knowledge, this paper is the first to introduce data sources, methods and models to tackle the task of geolocation of speech at a global scale, and the first to explore using geolocation as a proxy-task for LID. Specifically, we explore whether radio broadcasts with known origin can be used to train regression and classification-based models for geolocating speech. We build models on top of self-supervised pretrained models, using attention pooling to qualitatively verify that the model geolocates the speech itself, and not other channel artifacts. The best geolocation models localize speaker origin to around 650 km. We confirm the value of speech geolocation as a proxy task by using speech geolocation models for zero-shot LID. Finally, we show that fine-tuning geolocation models for LID outperforms fine-tuning pretrained Wav2Vec2.0 models, and achieves state-of-the-art performance on the FLEURS benchmark.
Proceedings of the Ninth Conference on Machine Translation
Barry Haddow | Tom Kocmi | Philipp Koehn | Christof Monz
Proceedings of the Ninth Conference on Machine Translation
Findings of the WMT24 General Machine Translation Shared Task: The LLM Era Is Here but MT Is Not Solved Yet
Tom Kocmi | Eleftherios Avramidis | Rachel Bawden | Ondřej Bojar | Anton Dvorkovich | Christian Federmann | Mark Fishel | Markus Freitag | Thamme Gowda | Roman Grundkiewicz | Barry Haddow | Marzena Karpinska | Philipp Koehn | Benjamin Marie | Christof Monz | Kenton Murray | Masaaki Nagata | Martin Popel | Maja Popović | Mariya Shmatova | Steinthór Steingrímsson | Vilém Zouhar
Proceedings of the Ninth Conference on Machine Translation
This overview paper presents the results of the General Machine Translation Task organised as part of the 2024 Conference on Machine Translation (WMT). In the general MT task, participants were asked to build machine translation systems for any of 11 language pairs, to be evaluated on test sets consisting of three to five different domains. In addition to participating systems, we collected translations from 8 different large language models (LLMs) and 4 online translation providers. We evaluate system outputs with professional human annotators using a new protocol called Error Span Annotations (ESA).
Findings of the WMT 2024 Shared Task of the Open Language Data Initiative
Laurie Burchell | Jean Maillard | Antonios Anastasopoulos | Christian Federmann | Philipp Koehn | Skyler Wang
Proceedings of the Ninth Conference on Machine Translation
We present the results of the WMT 2024 shared task of the Open Language Data Initiative. Participants were invited to contribute to the FLORES+ and MT Seed multilingual datasets, two foundational open resources that facilitate the organic expansion of language technology’s reach. We accepted ten submissions covering 16 languages, which extended the range of languages included in the datasets and improved the quality of existing data.
Findings of the WMT 2024 Shared Task on Discourse-Level Literary Translation
Longyue Wang | Siyou Liu | Chenyang Lyu | Wenxiang Jiao | Xing Wang | Jiahao Xu | Zhaopeng Tu | Yan Gu | Weiyu Chen | Minghao Wu | Liting Zhou | Philipp Koehn | Andy Way | Yulin Yuan
Proceedings of the Ninth Conference on Machine Translation
Translating literary works has perennially stood as an elusive dream in machine translation (MT), a journey steeped in intricate challenges. To foster progress in this domain, we hold a new shared task at WMT 2024, the second edition of the Discourse-Level Literary Translation. First, we (Tencent AI Lab and China Literature Ltd.) release a copyrighted, document-level Chinese-English web novel corpus. Furthermore, we put forth industry-endorsed criteria to guide the human evaluation process. This year, we received a total of 10 submissions from 5 academia and industry teams. We employ both automatic and human evaluations to measure the performance of the submitted systems. The official ranking of the systems is based on the overall human judgments. In addition, our extensive analysis reveals a series of interesting findings on literary and discourse-aware MT. We release data, system outputs, and the leaderboard at https://www2.statmt.org/wmt24/literary-translation-task.html.
Benchmarking Visually-Situated Translation of Text in Natural Images
Elizabeth Salesky | Philipp Koehn | Matt Post
Proceedings of the Ninth Conference on Machine Translation
We introduce a benchmark, Vistra, for visually-situated translation of English text in natural images into four target languages. We describe the dataset construction and composition. We benchmark open-source and commercial OCR and MT models on Vistra, and present both quantitative results and a taxonomy of common OCR error classes with their effect on downstream MT. Finally, we assess direct image-to-text translation with a multimodal LLM, and show that it is able, in some cases but not yet consistently, to disambiguate possible translations with visual context. We show that this is an unsolved and challenging task even for strong commercial models. We hope that the creation and release of this benchmark, which is the first of its kind for these language pairs, will encourage further research in this direction.
Neural Methods for Aligning Large-Scale Parallel Corpora from the Web for South and East Asian Languages
Philipp Koehn
Proceedings of the Ninth Conference on Machine Translation
We introduce neural methods and a toxicity filtering step to the hierarchical web mining approach of Paracrawl (Bañón et al., 2020), showing large improvements. We apply these methods to web-scale parallel corpus mining for 9 South and East Asian national languages, creating training resources for machine translation that yield better translation quality for most of these languages than existing publicly available datasets in OPUS. Our methods also generally lead to better results than the global mining approach of Schwenk et al. (2021).
2023
Small Data, Big Impact: Leveraging Minimal Data for Effective Machine Translation
Jean Maillard | Cynthia Gao | Elahe Kalbassi | Kaushik Ram Sadagopan | Vedanuj Goswami | Philipp Koehn | Angela Fan | Francisco Guzman
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
For many languages, machine translation progress is hindered by the lack of reliable training data. Models are trained on whatever pre-existing datasets may be available and then augmented with synthetic data, because it is often not economical to pay for the creation of large-scale datasets. But for the case of low-resource languages, would the creation of a few thousand professionally translated sentence pairs give any benefit? In this paper, we show that it does. We describe a broad data collection effort involving around 6k professionally translated sentence pairs for each of 39 low-resource languages, which we make publicly available. We analyse the gains of models trained on this small but high-quality data, showing that it has significant impact even when larger but lower quality pre-existing corpora are used, or when data is augmented with millions of sentences through backtranslation.
Multilingual Representation Distillation with Contrastive Learning
Weiting Tan | Kevin Heffernan | Holger Schwenk | Philipp Koehn
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics
Multilingual sentence representations from large models encode semantic information from two or more languages and can be used for different cross-lingual information retrieval and matching tasks. In this paper, we integrate contrastive learning into multilingual representation distillation and use it for quality estimation of parallel sentences (i.e., finding semantically similar sentences that can be used as translations of each other). We validate our approach with multilingual similarity search and corpus filtering tasks. Experiments across different low-resource languages show that our method greatly outperforms previous sentence encoders such as LASER, LASER3, and LaBSE.
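A minimal sketch of distillation with an in-batch contrastive (InfoNCE-style) objective, assuming a frozen teacher encoder and a trainable student; the dimensions and temperature are illustrative placeholders, not the paper's exact setup.

import torch
import torch.nn.functional as F

def contrastive_distill_loss(student_emb, teacher_emb, temperature=0.05):
    """Pull each student embedding toward its teacher counterpart while
    pushing it away from other sentences in the batch (InfoNCE)."""
    s = F.normalize(student_emb, dim=-1)
    t = F.normalize(teacher_emb, dim=-1)
    logits = s @ t.T / temperature      # (batch, batch) similarities
    labels = torch.arange(s.size(0))    # positives lie on the diagonal
    return F.cross_entropy(logits, labels)

# Toy usage: random tensors stand in for encoder outputs.
student = torch.randn(8, 128, requires_grad=True)  # trainable student
teacher = torch.randn(8, 128)                      # frozen teacher outputs
loss = contrastive_distill_loss(student, teacher)
loss.backward()
print(float(loss))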
Condensing Multilingual Knowledge with Lightweight Language-Specific Modules
Haoran Xu | Weiting Tan | Shuyue Li | Yunmo Chen | Benjamin Van Durme | Philipp Koehn | Kenton Murray
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Incorporating language-specific (LS) modules or Mixture-of-Experts (MoE) is a proven way to boost multilingual model performance, but scaling these approaches to hundreds of languages or experts tends to be hard to manage. We present Language-specific Matrix Synthesis (LMS), a novel method that addresses this issue. LMS utilizes parameter-efficient and lightweight modules, reducing the number of parameters while outperforming existing methods, e.g., +1.73 BLEU over Switch Transformer on OPUS-100 multilingual translation. Additionally, we introduce Fuse Distillation (FD) to condense multilingual knowledge from multiple LS modules into a single shared module, improving model inference and storage efficiency. Our approach demonstrates superior scalability and performance compared to state-of-the-art methods.
Multilingual Pixel Representations for Translation and Effective Cross-lingual Transfer
Elizabeth Salesky | Neha Verma | Philipp Koehn | Matt Post
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
We introduce and demonstrate how to effectively train multilingual machine translation models with pixel representations. We experiment with two different data settings with a variety of language and script coverage, demonstrating improved performance compared to subword embeddings. We explore various properties of pixel representations such as parameter sharing within and across scripts to better understand where they lead to positive transfer. We observe that these properties not only enable seamless cross-lingual transfer to unseen scripts, but make pixel representations more data-efficient than alternatives such as vocabulary expansion. We hope this work contributes to more extensible multilingual models for all languages and scripts.
Learning from Mistakes: Towards Robust Neural Machine Translation for Disfluent L2 Sentences
Shuyue Stella Li | Philipp Koehn
Proceedings of Machine Translation Summit XIX, Vol. 1: Research Track
We study sentences written by second-language (L2) learners to improve the robustness of current neural machine translation (NMT) models on this type of data. Current large datasets used to train NMT systems are mostly Wikipedia or government documents written by highly competent speakers of that language, especially English. However, given that English is the most common second language, it is crucial that machine translation systems are robust against the large number of sentences written by L2 learners of English. By studying the difficulties faced by humans in their L2 acquisition process, we are able to transfer such insights to machine translation systems to recover from source-side fluency variations. In this work, we create additional training data with artificial errors similar to mistakes made by L2 learners of various fluency levels to improve the quality of the machine translation system. We test our method in zero-shot settings on the JFLEG-es (English-Spanish) dataset. The quality of our machine translation system on disfluent sentences outperforms the baseline by 1.8 BLEU points.
Proceedings of the Eighth Conference on Machine Translation
Philipp Koehn | Barry Haddow | Tom Kocmi | Christof Monz
Proceedings of the Eighth Conference on Machine Translation
Findings of the 2023 Conference on Machine Translation (WMT23): LLMs Are Here but Not Quite There Yet
Tom Kocmi | Eleftherios Avramidis | Rachel Bawden | Ondřej Bojar | Anton Dvorkovich | Christian Federmann | Mark Fishel | Markus Freitag | Thamme Gowda | Roman Grundkiewicz | Barry Haddow | Philipp Koehn | Benjamin Marie | Christof Monz | Makoto Morishita | Kenton Murray | Masaaki Nagata | Toshiaki Nakazawa | Martin Popel | Maja Popović | Mariya Shmatova | Jun Suzuki
Proceedings of the Eighth Conference on Machine Translation
This paper presents the results of the General Machine Translation Task organised as part of the 2023 Conference on Machine Translation (WMT). In the general MT task, participants were asked to build machine translation systems for any of 8 language pairs (corresponding to 14 translation directions), to be evaluated on test sets consisting of up to four different domains. We evaluate system outputs with professional human annotators using a combination of source-based Direct Assessment and scalar quality metric (DA+SQM).
Findings of the WMT 2023 Shared Task on Discourse-Level Literary Translation: A Fresh Orb in the Cosmos of LLMs
Longyue Wang | Zhaopeng Tu | Yan Gu | Siyou Liu | Dian Yu | Qingsong Ma | Chenyang Lyu | Liting Zhou | Chao-Hong Liu | Yufeng Ma | Weiyu Chen | Yvette Graham | Bonnie Webber | Philipp Koehn | Andy Way | Yulin Yuan | Shuming Shi
Proceedings of the Eighth Conference on Machine Translation
Translating literary works has perennially stood as an elusive dream in machine translation (MT), a journey steeped in intricate challenges. To foster progress in this domain, we hold a new shared task at WMT 2023, the first edition of the Discourse-Level Literary Translation. First, we (Tencent AI Lab and China Literature Ltd.) release a copyrighted, document-level Chinese-English web novel corpus. Furthermore, we put forth industry-endorsed criteria to guide the human evaluation process. This year, we received a total of 14 submissions from 7 academia and industry teams. We employ both automatic and human evaluations to measure the performance of the submitted systems. The official ranking of the systems is based on the overall human judgments. In addition, our extensive analysis reveals a series of interesting findings on literary and discourse-aware MT. We release data, system outputs, and the leaderboard at http://www2.statmt.org/wmt23/literary-translation-task.html.
Findings of the WMT 2023 Shared Task on Parallel Data Curation
Steve Sloto | Brian Thompson | Huda Khayrallah | Tobias Domhan | Thamme Gowda | Philipp Koehn
Proceedings of the Eighth Conference on Machine Translation
Building upon prior WMT shared tasks in document alignment and sentence filtering, we posed the open-ended shared task of finding the best subset of possible training data from a collection of Estonian-Lithuanian web data. Participants could focus on any portion of the end-to-end data curation pipeline, including alignment and filtering. We evaluated results based on downstream machine translation quality. We release processed Common Crawl data, along with various intermediate states from a strong baseline system, which we believe will enable future research on this topic.
Machine Translation with Large Language Models: Prompting, Few-shot Learning, and Fine-tuning with QLoRA
Xuan Zhang | Navid Rajabi | Kevin Duh | Philipp Koehn
Proceedings of the Eighth Conference on Machine Translation
While large language models have made remarkable advancements in natural language generation, their potential in machine translation, especially when fine-tuned, remains under-explored. In our study, we conduct comprehensive experiments, evaluating 15 publicly available language models on machine translation tasks. We compare the performance across three methodologies: zero-shot prompting, few-shot learning, and fine-tuning. Central to our approach is the use of QLoRA, an efficient fine-tuning method. On French-English, QLoRA fine-tuning outperforms both few-shot learning and models trained from scratch. This superiority is highlighted in both sentence-level and document-level translations, with a significant BLEU score improvement of 28.93 over the prompting method. Impressively, with QLoRA, the enhanced performance is achieved by fine-tuning a mere 0.77% of the model’s parameters.
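For readers unfamiliar with QLoRA, the setup looks roughly like the following Hugging Face transformers/peft sketch; the model name and hyperparameters are illustrative placeholders, not the paper's configuration.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization of the frozen base model (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # placeholder; any causal LM works
    quantization_config=bnb_config,
)

# Small trainable low-rank adapters on attention projections (the LoRA part).
lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% trainable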
Findings of the Word-Level AutoCompletion Shared Task in WMT 2023
Lemao Liu | Francisco Casacuberta | George Foster | Guoping Huang | Philipp Koehn | Geza Kovacs | Shuming Shi | Taro Watanabe | Chengqing Zong
Proceedings of the Eighth Conference on Machine Translation
This paper presents the overview of the second Word-Level AutoCompletion (WLAC) shared task for computer-aided translation, which aims to automatically complete a target word given a translation context including a human-typed character sequence. We largely adhere to the settings of the previous round of the shared task, but with two main differences: 1) for some of the test examples, the typed character sequence is obtained from the typing process of human translators, to demonstrate system performance under real-world scenarios; 2) we conduct a thorough analysis of the results of the submitted systems from three perspectives. From the experimental results, we observe that translation tasks are helpful for improving the performance of WLAC models. Additionally, our further analysis shows that semantic errors account for a significant portion of all errors, and thus it would be promising to take this type of error into account in future work.
2022
Alternative Input Signals Ease Transfer in Multilingual Machine Translation
Simeng Sun | Angela Fan | James Cross | Vishrav Chaudhary | Chau Tran | Philipp Koehn | Francisco Guzmán
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Recent work in multilingual machine translation (MMT) has focused on the potential of positive transfer between languages, particularly cases where higher-resourced languages can benefit lower-resourced ones. While training an MMT model, the supervision signals learned from one language pair can be transferred to the other via the tokens shared by multiple source languages. However, the transfer is inhibited when the token overlap among source languages is small, which manifests naturally when languages use different writing systems. In this paper, we tackle inhibited transfer by augmenting the training data with alternative signals that unify different writing systems, such as phonetic, romanized, and transliterated input. We test these signals on Indic and Turkic languages, two language families where the writing systems differ but languages still share common features. Our results indicate that a straightforward multi-source self-ensemble (training a model on a mixture of various signals and ensembling the outputs of the same model fed with different signals during inference) outperforms strong ensemble baselines by 1.3 BLEU points on both language families. Further, we find that incorporating alternative inputs via self-ensemble can be particularly effective when the training set is small, leading to +5 BLEU when only 5% of the total training data is accessible. Finally, our analysis demonstrates that including alternative signals yields more consistency and translates named entities more accurately, which is crucial for increased factuality of automated systems.
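The self-ensemble amounts to feeding the same model each input variant and averaging the predicted distributions. A minimal decoding-step sketch, with a toy stand-in model so the snippet runs:

import torch
import torch.nn.functional as F

def self_ensemble_next_token(model, variants, prefix):
    """Average next-token distributions of ONE model fed several input
    variants (e.g. original script, romanized, phonetic)."""
    probs = [F.softmax(model(v, prefix), dim=-1) for v in variants]
    return torch.stack(probs).mean(dim=0).argmax().item()

# Toy stand-in model: ignores its inputs and returns random logits over
# a 100-type vocabulary, just to make the sketch runnable.
torch.manual_seed(0)
toy_model = lambda src, prefix: torch.randn(100)
variants = ["devanagari", "romanized", "phonetic"]  # placeholder inputs
print(self_ensemble_next_token(toy_model, variants, prefix=[]))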
Doubly-Trained Adversarial Data Augmentation for Neural Machine Translation
Weiting Tan | Shuoyang Ding | Huda Khayrallah | Philipp Koehn
Proceedings of the 15th biennial conference of the Association for Machine Translation in the Americas (Volume 1: Research Track)
Neural Machine Translation (NMT) models are known to suffer from noisy inputs. To make models robust, we generate adversarial augmentation samples that attack the model and preserve the source-side meaning at the same time. To generate such samples, we propose a doubly-trained architecture that pairs two NMT models of opposite translation directions with a joint loss function, which combines the target-side attack and the source-side semantic similarity constraint. The results from our experiments across three different language pairs and two evaluation metrics show that these adversarial samples improve model robustness.
Embedding-Enhanced GIZA++: Improving Low-Resource Word Alignment Using Embeddings
Kelly Marchisio | Conghao Xiong | Philipp Koehn
Proceedings of the 15th biennial conference of the Association for Machine Translation in the Americas (Volume 1: Research Track)
A popular natural language processing task decades ago, word alignment has been dominated until recently by GIZA++, a statistical method based on the 30-year-old IBM models. New methods that outperform GIZA++ primarily rely on large machine translation models, massively multilingual language models, or supervision from GIZA++ alignments themselves. We introduce Embedding-Enhanced GIZA++, and outperform GIZA++ without any of the aforementioned factors. Taking advantage of monolingual embedding spaces of the source and target language only, we exceed GIZA++’s performance in every tested scenario for three language pairs. In the lowest-resource setting, we outperform GIZA++ by 8.5, 10.9, and 12 AER for Ro-En, De-En, and En-Fr, respectively. We release our code at www.blind-review.code.
Consistent Human Evaluation of Machine Translation across Language Pairs
Daniel Licht | Cynthia Gao | Janice Lam | Francisco Guzman | Mona Diab | Philipp Koehn
Proceedings of the 15th biennial conference of the Association for Machine Translation in the Americas (Volume 1: Research Track)
Obtaining meaningful quality scores for machine translation systems through human evaluation remains a challenge given the high variability between human evaluators, partly due to subjective expectations for translation quality for different language pairs. We propose a new metric called XSTS that is more focused on semantic equivalence and a cross-lingual calibration method that enables more consistent assessment. We demonstrate the effectiveness of these novel contributions in large scale evaluation studies across up to 14 language pairs, with translation both into and out of English.
The Importance of Being Parameters: An Intra-Distillation Method for Serious Gains
Haoran Xu | Philipp Koehn | Kenton Murray
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Recent model pruning methods have demonstrated the ability to remove redundant parameters without sacrificing model performance. Common methods remove redundant parameters according to parameter sensitivity, a gradient-based measure reflecting the contribution of the parameters. In this paper, however, we argue that redundant parameters can be trained to make beneficial contributions. We first highlight the large sensitivity (contribution) gap between high-sensitivity and low-sensitivity parameters and show that model generalization performance can be significantly improved after balancing the contribution of all parameters. Our goal is to balance the sensitivity of all parameters and encourage all of them to contribute equally. We propose a general task-agnostic method, namely intra-distillation, appended to the regular training loss to balance parameter sensitivity. Moreover, we design a novel adaptive learning method to control the strength of the intra-distillation loss for faster convergence. Our experiments show the strong effectiveness of our methods on machine translation, natural language understanding, and zero-shot cross-lingual transfer across up to 48 languages, e.g., a gain of 3.54 BLEU on average across 8 language pairs from the IWSLT’14 dataset.
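One way to read an auxiliary loss that encourages all parameters to contribute is a consistency term over multiple stochastic forward passes of the same input; the sketch below uses symmetric KL between two dropout passes as that term. This is an illustrative approximation in the spirit of the abstract, not the paper's exact intra-distillation formulation.

import torch
import torch.nn as nn
import torch.nn.functional as F

def intra_consistency_loss(logits_a, logits_b):
    # Symmetric KL between two stochastic passes of the same input.
    pa, pb = F.log_softmax(logits_a, -1), F.log_softmax(logits_b, -1)
    return 0.5 * (F.kl_div(pa, pb, log_target=True, reduction="batchmean")
                  + F.kl_div(pb, pa, log_target=True, reduction="batchmean"))

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(),
                      nn.Dropout(0.1), nn.Linear(64, 10))
x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))
logits_a, logits_b = model(x), model(x)  # two passes, different dropout masks
alpha = 1.0  # loss strength (controlled adaptively in the paper)
loss = F.cross_entropy(logits_a, y) + alpha * intra_consistency_loss(logits_a, logits_b)
loss.backward()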
Bilingual Lexicon Induction for Low-Resource Languages using Graph Matching via Optimal Transport
Kelly Marchisio | Ali Saad-Eldin | Kevin Duh | Carey Priebe | Philipp Koehn
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Bilingual lexicons form a critical component of various natural language processing applications, including unsupervised and semi-supervised machine translation and cross-lingual information retrieval. In this work, we improve bilingual lexicon induction performance across 40 language pairs with a graph-matching method based on optimal transport. The method is especially strong with low amounts of supervision.
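A rough sketch of graph matching via Gromov-Wasserstein optimal transport with the POT library; the cosine-distance graphs and uniform marginals here are illustrative assumptions, not necessarily the paper's exact procedure.

import numpy as np
import ot  # POT library: pip install pot

def match_vocabularies(X, Y):
    """Match two monolingual embedding spaces via Gromov-Wasserstein.
    X, Y: (n, d) and (m, d) word embedding matrices. Returns a transport
    plan whose per-row argmax induces a bilingual lexicon."""
    def dist(Z):
        # Intra-lingual cosine-distance graph (no cross-lingual signal).
        Zn = Z / np.linalg.norm(Z, axis=1, keepdims=True)
        return 1.0 - Zn @ Zn.T
    p, q = ot.unif(len(X)), ot.unif(len(Y))
    return ot.gromov.gromov_wasserstein(dist(X), dist(Y), p, q, "square_loss")

# Toy usage: a noisy copy stands in for the "other language" space.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 64))
Y = X + 0.01 * rng.normal(size=X.shape)
T = match_vocabularies(X, Y)
print((T.argmax(axis=1) == np.arange(50)).mean())  # recovery rate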
Toward the Limitation of Code-Switching in Cross-Lingual Transfer
Yukun Feng | Feng Li | Philipp Koehn
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Multilingual pretrained models have shown strong cross-lingual transfer ability. Some works use code-switched sentences, which consist of tokens from multiple languages, to further enhance cross-lingual representations, and have shown success on many zero-shot cross-lingual tasks. However, code-switched tokens are likely to cause grammatical incoherence in the newly substituted sentences, negatively affecting performance on token-sensitive tasks such as Part-of-Speech (POS) tagging and Named-Entity Recognition (NER). This paper mitigates this limitation of the code-switching method by not only making the token replacement but also considering the similarity between the context and the switched tokens, so that the newly substituted sentences are grammatically consistent during both training and inference. We conduct experiments on cross-lingual POS and NER over 30+ languages, and demonstrate the effectiveness of our method by outperforming mBERT by 0.95 and the original code-switching method by 1.67 in F1 score.
IsoVec: Controlling the Relative Isomorphism of Word Embedding Spaces
Kelly Marchisio | Neha Verma | Kevin Duh | Philipp Koehn
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
The ability to extract high-quality translation dictionaries from monolingual word embedding spaces depends critically on the geometric similarity of the spaces—their degree of “isomorphism.” We address the root-cause of faulty cross-lingual mapping: that word embedding training resulted in the underlying spaces being non-isomorphic. We incorporate global measures of isomorphism directly into the skipgram loss function, successfully increasing the relative isomorphism of trained word embedding spaces and improving their ability to be mapped to a shared cross-lingual space. The result is improved bilingual lexicon induction in general data conditions, under domain mismatch, and with training algorithm dissimilarities. We release IsoVec at https://github.com/kellymarchisio/isovec.
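The key move, adding a global isomorphism measure as a differentiable term to the skip-gram loss, can be sketched as follows. The particular measure here (Pearson correlation of pairwise similarities over seed translation pairs) is one plausible choice chosen for illustration, and the surrounding skip-gram training loop is omitted.

import torch

def isomorphism_penalty(src_emb, tgt_emb, src_ids, tgt_ids):
    """Illustrative global isomorphism loss over seed translation pairs:
    1 minus the Pearson correlation between the two spaces' pairwise
    cosine similarities. Added (weighted) to the skip-gram loss."""
    S = torch.nn.functional.normalize(src_emb[src_ids], dim=-1)
    T = torch.nn.functional.normalize(tgt_emb[tgt_ids], dim=-1)
    iu = torch.triu_indices(len(src_ids), len(src_ids), offset=1)
    s, t = (S @ S.T)[iu[0], iu[1]], (T @ T.T)[iu[0], iu[1]]
    s, t = s - s.mean(), t - t.mean()
    corr = (s * t).sum() / (s.norm() * t.norm() + 1e-9)
    return 1.0 - corr

# Toy usage: trainable source space, fixed target space, 100 seed pairs.
src = torch.randn(1000, 64, requires_grad=True)
tgt = torch.randn(1000, 64)
ids = torch.arange(100)
loss = isomorphism_penalty(src, tgt, ids, ids)
loss.backward()
print(float(loss))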
Learn To Remember: Transformer with Recurrent Memory for Document-Level Machine Translation
Yukun Feng | Feng Li | Ziang Song | Boyuan Zheng | Philipp Koehn
Findings of the Association for Computational Linguistics: NAACL 2022
The Transformer architecture has led to significant gains in machine translation. However, most studies focus on only sentence-level translation without considering the context dependency within documents, leading to the inadequacy of document-level coherence. Some recent research tried to mitigate this issue by introducing an additional context encoder or translating with multiple sentences or even the entire document. Such methods may lose the information on the target side or have an increasing computational complexity as documents get longer. To address such problems, we introduce a recurrent memory unit to the vanilla Transformer, which supports the information exchange between the sentence and previous context. The memory unit is recurrently updated by acquiring information from sentences, and passing the aggregated knowledge back to subsequent sentence states. We follow a two-stage training strategy, in which the model is first trained at the sentence level and then finetuned for document-level translation. We conduct experiments on three popular datasets for document-level machine translation and our model has an average improvement of 0.91 s-BLEU over the sentence-level baseline. We also achieve state-of-the-art results on TED and News, outperforming the previous work by 0.36 s-BLEU and 1.49 d-BLEU on average.
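A compact sketch of the recurrent memory idea: a memory vector is updated from each sentence's encoding and injected into the next sentence's states. The GRU-based update and the shapes are illustrative assumptions, not the paper's exact unit.

import torch
import torch.nn as nn

class RecurrentMemory(nn.Module):
    """Carry a document-level memory vector across sentences."""
    def __init__(self, d_model):
        super().__init__()
        self.update = nn.GRUCell(d_model, d_model)     # memory update
        self.inject = nn.Linear(2 * d_model, d_model)  # context injection

    def forward(self, sent_states, memory):
        # sent_states: (len, d_model) encoder states of one sentence.
        pooled = sent_states.mean(dim=0)               # summarize sentence
        new_memory = self.update(pooled[None], memory[None])[0]
        # Inject the previous-context memory into each token state.
        expanded = memory.expand_as(sent_states)
        fused = self.inject(torch.cat([sent_states, expanded], dim=-1))
        return fused, new_memory

# Toy usage over a two-sentence "document".
d = 32
mem_unit = RecurrentMemory(d)
memory = torch.zeros(d)
for sent in [torch.randn(7, d), torch.randn(5, d)]:
    states, memory = mem_unit(sent, memory)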
Data Selection Curriculum for Neural Machine Translation
Tasnim Mohiuddin | Philipp Koehn | Vishrav Chaudhary | James Cross | Shruti Bhosale | Shafiq Joty
Findings of the Association for Computational Linguistics: EMNLP 2022
Neural Machine Translation (NMT) models are typically trained on heterogeneous data that are concatenated and randomly shuffled. However, not all of the training data are equally useful to the model. Curriculum training aims to present the data to the NMT models in a meaningful order. In this work, we introduce a two-stage training framework for NMT where we fine-tune a base NMT model on subsets of data, selected by both deterministic scoring using pre-trained methods and online scoring that considers prediction scores of the emerging NMT model. Through comprehensive experiments on six language pairs comprising low- and high-resource languages from WMT’21, we have shown that our curriculum strategies consistently demonstrate better quality (up to +2.2 BLEU improvement) and faster convergence (approximately 50% fewer updates).
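A schematic of the two-stage curriculum with placeholder scorers: a deterministic pre-trained score selects a fine-tuning pool, then an online score from the emerging model re-selects per phase. The scorer and trainer functions here are stubs for illustration only, not the paper's models.

import random

def deterministic_score(pair):
    # Stub for a pre-trained scorer (e.g. cross-lingual similarity);
    # here it just prefers mid-length sources, to keep the sketch runnable.
    return -abs(len(pair[0].split()) - 10)

def online_score(model, pair):
    # Stub for the emerging NMT model's prediction score on a pair.
    return model(pair)

def curriculum_finetune(base_model, data, train_fn, phases=3, keep=0.5):
    # Stage 1: deterministic selection of a fine-tuning pool.
    pool = sorted(data, key=deterministic_score, reverse=True)
    pool = pool[: int(len(pool) * keep)]
    model = base_model
    # Stage 2: per-phase online re-selection on the emerging model.
    for _ in range(phases):
        ranked = sorted(pool, key=lambda p: online_score(model, p), reverse=True)
        model = train_fn(model, ranked[: int(len(ranked) * keep)])
    return model

# Toy usage with stand-in data, model, and trainer.
random.seed(0)
data = [("src " + "x " * random.randint(1, 20), "tgt") for _ in range(100)]
model = lambda pair: random.random()
train_fn = lambda m, subset: m  # no-op trainer for the sketch
curriculum_finetune(model, data, train_fn)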
Proceedings of the Seventh Conference on Machine Translation (WMT)
Philipp Koehn | Loïc Barrault | Ondřej Bojar | Fethi Bougares | Rajen Chatterjee | Marta R. Costa-jussà | Christian Federmann | Mark Fishel | Alexander Fraser | Markus Freitag | Yvette Graham | Roman Grundkiewicz | Paco Guzman | Barry Haddow | Matthias Huck | Antonio Jimeno Yepes | Tom Kocmi | André Martins | Makoto Morishita | Christof Monz | Masaaki Nagata | Toshiaki Nakazawa | Matteo Negri | Aurélie Névéol | Mariana Neves | Martin Popel | Marco Turchi | Marcos Zampieri
Proceedings of the Seventh Conference on Machine Translation (WMT)
Findings of the 2022 Conference on Machine Translation (WMT22)
Tom Kocmi | Rachel Bawden | Ondřej Bojar | Anton Dvorkovich | Christian Federmann | Mark Fishel | Thamme Gowda | Yvette Graham | Roman Grundkiewicz | Barry Haddow | Rebecca Knowles | Philipp Koehn | Christof Monz | Makoto Morishita | Masaaki Nagata | Toshiaki Nakazawa | Michal Novák | Martin Popel | Maja Popović
Proceedings of the Seventh Conference on Machine Translation (WMT)
This paper presents the results of the General Machine Translation Task organised as part of the Conference on Machine Translation (WMT) 2022. In the general MT task, participants were asked to build machine translation systems for any of 11 language pairs, to be evaluated on test sets consisting of four different domains. We evaluate system outputs with human annotators using two different techniques: reference-based direct assessment (DA) and a combination of DA and scalar quality metric (DA+SQM).
Findings of the Word-Level AutoCompletion Shared Task in WMT 2022
Francisco Casacuberta | George Foster | Guoping Huang | Philipp Koehn | Geza Kovacs | Lemao Liu | Shuming Shi | Taro Watanabe | Chengqing Zong
Proceedings of the Seventh Conference on Machine Translation (WMT)
Recent years have witnessed rapid advancements in machine translation, but state-of-the-art machine translation systems still cannot satisfy the high requirements of some rigorous translation scenarios. Computer-aided translation (CAT) provides a promising solution for producing high-quality translations with quality guarantees. Unfortunately, due to the lack of popular benchmarks, research on CAT is not as well developed as research on machine translation. This year, we held a new shared task for CAT at WMT, called Word-level AutoCompletion (WLAC). Specifically, we introduce resources for training a WLAC model, and in particular we collect data from CAT systems as part of the test data for this shared task. In addition, we employ both automatic and human evaluation to measure the performance of the submitted systems, and our final evaluation results reveal some findings for the WLAC task.
2021
Adapting High-resource NMT Models to Translate Low-resource Related Languages without Parallel Data
Wei-Jen Ko | Ahmed El-Kishky | Adithya Renduchintala | Vishrav Chaudhary | Naman Goyal | Francisco Guzmán | Pascale Fung | Philipp Koehn | Mona Diab
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
The scarcity of parallel data is a major obstacle for training high-quality machine translation systems for low-resource languages. Fortunately, some low-resource languages are linguistically related or similar to high-resource languages; these related languages may share many lexical or syntactic structures. In this work, we exploit this linguistic overlap to facilitate translating to and from a low-resource language with only monolingual data, in addition to any parallel data in the related high-resource language. Our method, NMT-Adapt, combines denoising autoencoding, back-translation and adversarial objectives to utilize monolingual data for low-resource adaptation. We experiment on 7 languages from three different language families and show that our technique significantly improves translation into low-resource language compared to other translation baselines.
Zero-Shot Cross-Lingual Dependency Parsing through Contextual Embedding Transformation
Haoran Xu | Philipp Koehn
Proceedings of the Second Workshop on Domain Adaptation for NLP
Linear embedding transformation has been shown to be effective for zero-shot cross-lingual transfer tasks and achieves surprisingly promising results. However, cross-lingual embedding space mapping is usually studied on static word-level embeddings, where a space transformation is derived by aligning representations of translation pairs drawn from dictionaries. We move beyond this line of work and investigate a contextual embedding alignment approach that is sense-level and dictionary-free. To enhance the quality of the mapping, we also analyze key properties of contextual embeddings, i.e., the anisotropy problem and its solution. Experiments on zero-shot dependency parsing through the concept-shared space built by our embedding transformation substantially outperform state-of-the-art methods using multilingual embeddings.
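A common remedy for anisotropy in embedding spaces is to mean-centre the vectors and remove their dominant principal components. The NumPy sketch below shows that generic post-processing step; it is an assumption for illustration, not necessarily the exact solution adopted in the paper.

```python
import numpy as np

def reduce_anisotropy(embs, n_components=2):
    """Mean-centre embeddings and project out their top principal
    components, a common post-processing step for anisotropic spaces."""
    centred = embs - embs.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    top = vt[:n_components]                 # dominant directions (k, dim)
    return centred - centred @ top.T @ top  # remove their contribution
```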
Levenshtein Training for Word-level Quality Estimation
Shuoyang Ding | Marcin Junczys-Dowmunt | Matt Post | Philipp Koehn
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
We propose a novel scheme to use the Levenshtein Transformer to perform the task of word-level quality estimation. A Levenshtein Transformer is a natural fit for this task: trained to perform decoding in an iterative manner, a Levenshtein Transformer can learn to post-edit without explicit supervision. To further minimize the mismatch between the translation task and the word-level QE task, we propose a two-stage transfer learning procedure on both augmented data and human post-editing data. We also propose heuristics to construct reference labels that are compatible with subword-level finetuning and inference. Results on WMT 2020 QE shared task dataset show that our proposed method has superior data efficiency under the data-constrained setting and competitive performance under the unconstrained setting.
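The link between post-editing and word-level QE labels can be illustrated with a toy tagger: MT tokens that survive an edit-distance alignment to the post-edited text are OK, the rest BAD. This simplified word-level sketch (the paper's heuristics operate on subwords) uses Python's standard difflib:

```python
from difflib import SequenceMatcher

def word_level_tags(mt_tokens, pe_tokens):
    """Tag MT tokens OK/BAD from an edit alignment to the post-edit:
    tokens kept verbatim are OK, everything else is BAD."""
    tags = ["BAD"] * len(mt_tokens)
    matcher = SequenceMatcher(a=mt_tokens, b=pe_tokens, autojunk=False)
    for op, i1, i2, _, _ in matcher.get_opcodes():
        if op == "equal":
            for i in range(i1, i2):
                tags[i] = "OK"
    return tags

print(word_level_tags("the cat sat".split(), "the dog sat".split()))
# -> ['OK', 'BAD', 'OK']
```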
XLEnt: Mining a Large Cross-lingual Entity Dataset with Lexical-Semantic-Phonetic Word Alignment
Ahmed El-Kishky | Adithya Renduchintala | James Cross | Francisco Guzmán | Philipp Koehn
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Cross-lingual named-entity lexica are an important resource to multilingual NLP tasks such as machine translation and cross-lingual wikification. While knowledge bases contain a large number of entities in high-resource languages such as English and French, corresponding entities for lower-resource languages are often missing. To address this, we propose Lexical-Semantic-Phonetic Align (LSP-Align), a technique to automatically mine cross-lingual entity lexica from mined web data. We demonstrate LSP-Align outperforms baselines at extracting cross-lingual entity pairs and mine 164 million entity pairs from 120 different languages aligned with English. We release these cross-lingual entity pairs along with the massively multilingual tagged named entity corpus as a resource to the NLP community.
An Analysis of Euclidean vs. Graph-Based Framing for Bilingual Lexicon Induction from Word Embedding Spaces
Kelly Marchisio | Youngser Park | Ali Saad-Eldin | Anton Alyakin | Kevin Duh | Carey Priebe | Philipp Koehn
Findings of the Association for Computational Linguistics: EMNLP 2021
Much recent work in bilingual lexicon induction (BLI) views word embeddings as vectors in Euclidean space. As such, BLI is typically solved by finding a linear transformation that maps embeddings to a common space. Alternatively, word embeddings may be understood as nodes in a weighted graph. This framing allows us to examine a node’s graph neighborhood without assuming a linear transform, and exploits new techniques from the graph matching optimization literature. These contrasting approaches have not been compared in BLI so far. In this work, we study the behavior of Euclidean versus graph-based approaches to BLI under differing data conditions and show that they complement each other when combined. We release our code at https://github.com/kellymarchisio/euc-v-graph-bli.
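For context, the Euclidean framing typically reduces to the orthogonal Procrustes problem: find the rotation W that best maps source embeddings onto their translations. A minimal NumPy sketch of that baseline (the Euclidean side only, not the graph-matching approach or the paper's exact pipeline):

```python
import numpy as np

def procrustes_map(X, Y):
    """Orthogonal W minimising ||XW - Y||_F for row-paired embedding
    matrices X, Y of shape (n, d): W = UV^T with U, S, V^T = SVD(X^T Y)."""
    u, _, vt = np.linalg.svd(X.T @ Y)
    return u @ vt
```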
Learning Curricula for Multilingual Neural Machine Translation Training
Gaurav Kumar | Philipp Koehn | Sanjeev Khudanpur
Proceedings of Machine Translation Summit XVIII: Research Track
Low-resource Multilingual Neural Machine Translation (MNMT) is typically tasked with improving translation performance on one or more language pairs with the aid of high-resource language pairs. In this paper, we propose two simple search-based curricula – orderings of the multilingual training data – which help improve translation performance in conjunction with existing techniques such as fine-tuning. Additionally, we attempt to learn a curriculum for MNMT from scratch, jointly with the training of the translation system, using contextual multi-armed bandits. We show on the FLORES low-resource translation dataset that these learned curricula can provide better starting points for fine-tuning and improve the overall performance of the translation system.
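As a rough illustration of learning a curriculum from bandit feedback, the sketch below implements plain EXP3 over data shards, with a normalized dev-set improvement as the reward. The paper uses contextual bandits, so treat this as a simplified stand-in.

```python
import math
import random

class Exp3:
    """Plain EXP3 over a fixed set of data shards."""
    def __init__(self, n_arms, gamma=0.1):
        self.gamma = gamma
        self.w = [1.0] * n_arms

    def probs(self):
        total, n = sum(self.w), len(self.w)
        return [(1 - self.gamma) * wi / total + self.gamma / n for wi in self.w]

    def pull(self):
        # Sample the next shard to train on.
        return random.choices(range(len(self.w)), weights=self.probs())[0]

    def update(self, arm, reward):
        # reward in [0, 1], e.g. a normalized dev-BLEU improvement.
        p = self.probs()[arm]
        self.w[arm] *= math.exp(self.gamma * reward / (p * len(self.w)))
```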
An Alignment-Based Approach to Semi-Supervised Bilingual Lexicon Induction with Small Parallel Corpora
Kelly Marchisio | Philipp Koehn | Conghao Xiong
Proceedings of Machine Translation Summit XVIII: Research Track
Aimed at generating a seed lexicon for use in downstream natural language tasks, unsupervised methods for bilingual lexicon induction have received much attention in the academic literature recently. While interesting, fully unsupervised settings are unrealistic; small amounts of bilingual data are usually available, either through massively multilingual parallel corpora or because linguists can create small amounts of parallel data. In this work, we demonstrate an effective bootstrapping approach for semi-supervised bilingual lexicon induction that capitalizes upon the complementary strengths of two disparate methods for inducing bilingual lexicons. Whereas statistical methods are highly effective at inducing correct translation pairs for words frequently occurring in a parallel corpus, monolingual embedding spaces have the advantage of having been trained on large amounts of data, and therefore may induce accurate translations for words absent from the small corpus. By combining these relative strengths, our method achieves state-of-the-art results on 3 of 4 language pairs in the challenging VecMap test set using minimal amounts of parallel data and without the need for a translation dictionary. We release our implementation at www.blind-review.code.
Evaluating Saliency Methods for Neural Language Models
Shuoyang Ding | Philipp Koehn
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Saliency methods are widely used to interpret neural network predictions, but different variants of saliency methods often disagree even on the interpretations of the same prediction made by the same model. In these cases, how do we identify when these interpretations are trustworthy enough to be used in analyses? To address this question, we conduct a comprehensive and quantitative evaluation of saliency methods on a fundamental category of NLP models: neural language models. We evaluate the quality of prediction interpretations from two perspectives, each representing a desirable property of these interpretations: plausibility and faithfulness. Our evaluation is conducted on four different datasets constructed from existing human annotations of syntactic and semantic agreements, at both the sentence level and the document level. Through our evaluation, we identify various ways in which saliency methods can yield interpretations of low quality. We recommend that future work deploying such methods on neural language models should carefully validate their interpretations before drawing insights.
Proceedings of the Sixth Conference on Machine Translation
Loïc Barrault | Ondřej Bojar | Fethi Bougares | Rajen Chatterjee | Marta R. Costa-jussà | Christian Federmann | Mark Fishel | Alexander Fraser | Markus Freitag | Yvette Graham | Roman Grundkiewicz | Paco Guzman | Barry Haddow | Matthias Huck | Antonio Jimeno Yepes | Philipp Koehn | Tom Kocmi | André Martins | Makoto Morishita | Christof Monz
Proceedings of the Sixth Conference on Machine Translation
Findings of the 2021 Conference on Machine Translation (WMT21)
Farhad Akhbardeh | Arkady Arkhangorodsky | Magdalena Biesialska | Ondřej Bojar | Rajen Chatterjee | Vishrav Chaudhary | Marta R. Costa-jussà | Cristina España-Bonet | Angela Fan | Christian Federmann | Markus Freitag | Yvette Graham | Roman Grundkiewicz | Barry Haddow | Leonie Harter | Kenneth Heafield | Christopher Homan | Matthias Huck | Kwabena Amponsah-Kaakyire | Jungo Kasai | Daniel Khashabi | Kevin Knight | Tom Kocmi | Philipp Koehn | Nicholas Lourie | Christof Monz | Makoto Morishita | Masaaki Nagata | Ajay Nagesh | Toshiaki Nakazawa | Matteo Negri | Santanu Pal | Allahsera Auguste Tapo | Marco Turchi | Valentin Vydrin | Marcos Zampieri
Proceedings of the Sixth Conference on Machine Translation
This paper presents the results of the news translation task, the multilingual low-resource translation task for Indo-European languages, the triangular translation task, and the automatic post-editing task organised as part of the Conference on Machine Translation (WMT) 2021. In the news task, participants were asked to build machine translation systems for any of 10 language pairs, to be evaluated on test sets consisting mainly of news stories. The task was also opened up to additional test suites to probe specific aspects of translation.
Facebook AI’s WMT21 News Translation Task Submission
Chau Tran | Shruti Bhosale | James Cross | Philipp Koehn | Sergey Edunov | Angela Fan
Proceedings of the Sixth Conference on Machine Translation
We describe Facebook’s multilingual model submission to the WMT2021 shared task on news translation. We participate in 14 language directions: English to and from Czech, German, Hausa, Icelandic, Japanese, Russian, and Chinese. To develop systems covering all these directions, we focus on multilingual models. We utilize data from all available sources — WMT, large-scale data mining, and in-domain backtranslation — to create high quality bilingual and multilingual baselines. Subsequently, we investigate strategies for scaling multilingual model size, such that one system has sufficient capacity for high quality representations of all eight languages. Our final submission is an ensemble of dense and sparse Mixture-of-Expert multilingual translation models, followed by finetuning on in-domain news data and noisy channel reranking. Compared to previous year’s winning submissions, our multilingual system improved the translation quality on all language directions, with an average improvement of 2.0 BLEU. In the WMT2021 task, our system ranks first in 10 directions based on automatic evaluation.
Findings of the WMT Shared Task on Machine Translation Using Terminologies
Md Mahfuz Ibn Alam | Ivana Kvapilíková | Antonios Anastasopoulos | Laurent Besacier | Georgiana Dinu | Marcello Federico | Matthias Gallé | Kweonwoo Jung | Philipp Koehn | Vassilina Nikoulina
Proceedings of the Sixth Conference on Machine Translation
Language domains that require very careful use of terminology are abundant and reflect a significant part of the translation industry. In this work we introduce a benchmark for evaluating the quality and consistency of terminology translation, focusing on the medical (and COVID-19 specifically) domain for five language pairs: English to French, Chinese, Russian, and Korean, as well as Czech to German. We report the descriptions and results of the participating systems, commenting on the need for further research efforts towards both more adequate handling of terminologies as well as towards a proper formulation and evaluation of the task.
The JHU-Microsoft Submission for WMT21 Quality Estimation Shared Task
Shuoyang Ding | Marcin Junczys-Dowmunt | Matt Post | Christian Federmann | Philipp Koehn
Proceedings of the Sixth Conference on Machine Translation
This paper presents the JHU-Microsoft joint submission for WMT 2021 quality estimation shared task. We only participate in Task 2 (post-editing effort estimation) of the shared task, focusing on the target-side word-level quality estimation. The techniques we experimented with include Levenshtein Transformer training and data augmentation with a combination of forward, backward, round-trip translation, and pseudo post-editing of the MT output. We demonstrate the competitiveness of our system compared to the widely adopted OpenKiwi-XLM baseline. Our system is also the top-ranking system on the MT MCC metric for the English-German language pair.
Learning Feature Weights using Reward Modeling for Denoising Parallel Corpora
Gaurav Kumar | Philipp Koehn | Sanjeev Khudanpur
Proceedings of the Sixth Conference on Machine Translation
Large web-crawled corpora represent an excellent resource for improving the performance of Neural Machine Translation (NMT) systems across several language pairs. However, since these corpora are typically extremely noisy, their use is fairly limited. Current approaches to dealing with this problem mainly focus on filtering using heuristics or single features such as language model scores or bilingual similarity. This work presents an alternative approach which learns weights for multiple sentence-level features. These feature weights, which are optimized directly for the task of improving translation performance, are used to score and filter sentences in the noisy corpora more effectively. We provide results of applying this technique to building NMT systems using the ParaCrawl corpus for Estonian-English and show that it beats strong single-feature baselines and hand-designed combinations. Additionally, we analyze the sensitivity of this method to different types of noise and explore whether the learned weights generalize to other language pairs using the Maltese-English ParaCrawl corpus.
2020
SimulMT to SimulST: Adapting Simultaneous Text Translation to End-to-End Simultaneous Speech Translation
Xutai Ma | Juan Pino | Philipp Koehn
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing
We investigate how to adapt simultaneous text translation methods such as wait-k and monotonic multihead attention to end-to-end simultaneous speech translation by introducing a pre-decision module. A detailed analysis is provided on the latency-quality trade-offs of combining fixed and flexible pre-decision with fixed and flexible policies. We also design a novel computation-aware latency metric, adapted from Average Lagging.
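The wait-k policy mentioned above has a one-line decision rule: read until k source tokens lead the output, then alternate. A minimal sketch of that rule, ignoring the pre-decision module the paper introduces to adapt it to speech:

```python
def wait_k_action(k, num_read, num_written, source_finished):
    """Return the next action under a wait-k policy: read until k
    source tokens lead the output, then alternate write/read."""
    if not source_finished and num_read - num_written < k:
        return "READ"
    return "WRITE"
```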
ParaCrawl: Web-Scale Acquisition of Parallel Corpora
Marta Bañón | Pinzhen Chen | Barry Haddow | Kenneth Heafield | Hieu Hoang | Miquel Esplà-Gomis | Mikel L. Forcada | Amir Kamran | Faheem Kirefu | Philipp Koehn | Sergio Ortiz Rojas | Leopoldo Pla Sempere | Gema Ramírez-Sánchez | Elsa Sarrías | Marek Strelec | Brian Thompson | William Waites | Dion Wiggins | Jaume Zaragoza
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
We report on methods to create the largest publicly available parallel corpora by crawling the web, using open source software. We empirically compare alternative methods and publish benchmark data sets for sentence alignment and sentence pair filtering. We also describe the parallel corpora released and evaluate their quality and their usefulness to create machine translation systems.
A Survey of Qualitative Error Analysis for Neural Machine Translation Systems
Denise Díaz | James Cross | Vishrav Chaudhary | Ahmed El-Kishky | Philipp Koehn
Proceedings of the 14th Conference of the Association for Machine Translation in the Americas (Volume 2: User Track)
Statistical Power and Translationese in Machine Translation Evaluation
Yvette Graham | Barry Haddow | Philipp Koehn
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
The term translationese has been used to describe features of translated text, and in this paper, we provide a detailed analysis of potential adverse effects of translationese on machine translation evaluation. Our analysis shows differences in conclusions drawn from evaluations that include translationese in test data compared to experiments that tested only with text originally composed in that language. For this reason, we recommend that reverse-created test data be omitted from future machine translation test sets. In addition, we provide a re-evaluation of a past machine translation evaluation claiming human-parity of MT. One important issue not previously considered is the statistical power of significance tests applied to comparisons of human and machine translation. Since the very aim of past evaluations was the investigation of ties between human and MT systems, power analysis is of particular importance, to avoid, for example, claims of human parity simply corresponding to Type II error resulting from the application of a low-powered test. We provide a detailed analysis of the tests used in such evaluations to provide an indication of a suitable minimum sample size for future studies.
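For readers who want to run this kind of power analysis themselves, statsmodels exposes the standard calculation. The effect size, power, and alpha below are conventional defaults chosen for illustration, not values taken from the paper.

```python
from statsmodels.stats.power import TTestPower

# Sample size needed to detect a small effect (Cohen's d = 0.2) with
# 80% power at alpha = 0.05 in a paired t-test.
n = TTestPower().solve_power(effect_size=0.2, power=0.8, alpha=0.05)
print(round(n))  # on the order of 200 paired observations
```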
Simulated multiple reference training improves low-resource machine translation
Huda Khayrallah | Brian Thompson | Matt Post | Philipp Koehn
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Many valid translations exist for a given sentence, yet machine translation (MT) is trained with a single reference translation, exacerbating data sparsity in low-resource settings. We introduce Simulated Multiple Reference Training (SMRT), a novel MT training method that approximates the full space of possible translations by sampling a paraphrase of the reference sentence from a paraphraser and training the MT model to predict the paraphraser’s distribution over possible tokens. We demonstrate the effectiveness of SMRT in low-resource settings when translating to English, with improvements of 1.2 to 7.0 BLEU. We also find SMRT is complementary to back-translation.
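At its core, SMRT replaces the one-hot reference with the paraphraser's distribution over tokens. A hedged PyTorch sketch of that loss, omitting the sampling of the paraphrase and all other modeling details:

```python
import torch.nn.functional as F

def smrt_loss(mt_logits, paraphraser_probs):
    """Cross-entropy of the MT model against the paraphraser's soft
    distribution over tokens (shapes: [batch, time, vocab])."""
    log_p = F.log_softmax(mt_logits, dim=-1)
    return -(paraphraser_probs * log_p).sum(dim=-1).mean()
```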
CCAligned: A Massive Collection of Cross-Lingual Web-Document Pairs
Ahmed El-Kishky | Vishrav Chaudhary | Francisco Guzmán | Philipp Koehn
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Cross-lingual document alignment aims to identify pairs of documents in two distinct languages that are of comparable content or translations of each other. In this paper, we exploit the signals embedded in URLs to label web documents at scale with an average precision of 94.5% across different language pairs. We mine sixty-eight snapshots of the Common Crawl corpus and identify web document pairs that are translations of each other. We release a new web dataset consisting of over 392 million URL pairs from Common Crawl covering documents in 8144 language pairs of which 137 pairs include English. In addition to curating this massive dataset, we introduce baseline methods that leverage cross-lingual representations to identify aligned documents based on their textual content. Finally, we demonstrate the value of this parallel documents dataset through a downstream task of mining parallel sentences and measuring the quality of machine translations from models trained on this mined data. Our objective in releasing this dataset is to foster new research in cross-lingual NLP across a variety of low, medium, and high-resource languages.
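The URL signal can be pictured with a toy normalizer that collapses language identifiers, so that translated versions of the same page map to the same key. The pattern and language list below are invented for illustration; the actual pipeline covers far more patterns and languages.

```python
import re

# Hypothetical: collapse a handful of language codes appearing as URL
# path segments; the production system covers far more patterns.
LANG = re.compile(r"(?<=[/._-])(en|fr|de|si|ne)(?=[/._-]|$)")

def url_key(url):
    return LANG.sub("xx", url.lower())

assert url_key("example.com/en/news/1") == url_key("example.com/fr/news/1")
```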
Exploiting Sentence Order in Document Alignment
Brian Thompson | Philipp Koehn
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
We present a simple document alignment method that incorporates sentence order information in both candidate generation and candidate re-scoring. Our method results in 61% relative reduction in error compared to the best previously published result on the WMT16 document alignment shared task. Our method improves downstream MT performance on web-scraped Sinhala–English documents from ParaCrawl, outperforming the document alignment method used in the most recent ParaCrawl release. It also outperforms a comparable corpora method which uses the same multilingual embeddings, demonstrating that exploiting sentence order is beneficial even if the end goal is sentence-level bitext.
TICO-19: the Translation Initiative for COvid-19
Antonios Anastasopoulos | Alessandro Cattelan | Zi-Yi Dou | Marcello Federico | Christian Federmann | Dmitriy Genzel | Francisco Guzmán | Junjie Hu | Macduff Hughes | Philipp Koehn | Rosie Lazar | Will Lewis | Graham Neubig | Mengmeng Niu | Alp Öktem | Eric Paquin | Grace Tang | Sylwia Tur
Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020
The COVID-19 pandemic is the worst pandemic to strike the world in over a century. Crucial to stemming the tide of the SARS-CoV-2 virus is communicating to vulnerable populations the means by which they can protect themselves. To this end, the collaborators forming the Translation Initiative for COvid-19 (TICO-19) have made test and development data available to AI and MT researchers in 35 different languages in order to foster the development of tools and resources for improving access to information about COVID-19 in these languages. In addition to 9 high-resourced, “pivot” languages, the team is targeting 26 lesser resourced languages, in particular languages of Africa, South Asia and South-East Asia, whose populations may be the most vulnerable to the spread of the virus. The same data is translated into all of the languages represented, meaning that testing or development can be done for any pairing of languages in the set. Further, the team is converting the test and development data into translation memories (TMXs) that can be used by localizers from and to any of the languages.
Proceedings of the Fifth Conference on Machine Translation
Loïc Barrault | Ondřej Bojar | Fethi Bougares | Rajen Chatterjee | Marta R. Costa-jussà | Christian Federmann | Mark Fishel | Alexander Fraser | Yvette Graham | Paco Guzman | Barry Haddow | Matthias Huck | Antonio Jimeno Yepes | Philipp Koehn | André Martins | Makoto Morishita | Christof Monz | Masaaki Nagata | Toshiaki Nakazawa | Matteo Negri
Proceedings of the Fifth Conference on Machine Translation
Findings of the 2020 Conference on Machine Translation (WMT20)
Loïc Barrault | Magdalena Biesialska | Ondřej Bojar | Marta R. Costa-jussà | Christian Federmann | Yvette Graham | Roman Grundkiewicz | Barry Haddow | Matthias Huck | Eric Joanis | Tom Kocmi | Philipp Koehn | Chi-kiu Lo | Nikola Ljubešić | Christof Monz | Makoto Morishita | Masaaki Nagata | Toshiaki Nakazawa | Santanu Pal | Matt Post | Marcos Zampieri
Proceedings of the Fifth Conference on Machine Translation
This paper presents the results of the news translation task and the similar language translation task, both organised alongside the Conference on Machine Translation (WMT) 2020. In the news task, participants were asked to build machine translation systems for any of 11 language pairs, to be evaluated on test sets consisting mainly of news stories. The task was also opened up to additional test suites to probe specific aspects of translation. In the similar language translation task, participants built machine translation systems for translating between closely related pairs of languages.
Findings of the WMT 2020 Shared Task on Machine Translation Robustness
Lucia Specia | Zhenhao Li | Juan Pino | Vishrav Chaudhary | Francisco Guzmán | Graham Neubig | Nadir Durrani | Yonatan Belinkov | Philipp Koehn | Hassan Sajjad | Paul Michel | Xian Li
Proceedings of the Fifth Conference on Machine Translation
We report the findings of the second edition of the shared task on improving robustness in Machine Translation (MT). The task aims to test current machine translation systems in their ability to handle challenges facing MT models deployed in the real world, including domain diversity and non-standard texts common in user-generated content, especially in social media. We cover two language pairs – English-German and English-Japanese – and provide test sets in zero-shot and few-shot variants. Participating systems are evaluated both automatically and manually, with an additional human evaluation for “catastrophic errors”. We received 59 submissions by 11 participating teams from a variety of types of institutions.
When Does Unsupervised Machine Translation Work?
Kelly Marchisio | Kevin Duh | Philipp Koehn
Proceedings of the Fifth Conference on Machine Translation
Despite the reported success of unsupervised machine translation (MT), the field has yet to examine the conditions under which the methods succeed and fail. We conduct an extensive empirical evaluation using dissimilar language pairs, dissimilar domains, and diverse datasets. We find that performance rapidly deteriorates when source and target corpora are from different domains, and that stochasticity during embedding training can dramatically affect downstream results. We additionally find that unsupervised MT performance declines when source and target languages use different scripts, and observe very poor performance on authentic low-resource language pairs. We advocate for extensive empirical evaluation of unsupervised MT systems to highlight failure points and encourage continued research on the most promising paradigms. We release our preprocessed dataset to encourage evaluations that stress-test systems under multiple data conditions.
Findings of the WMT 2020 Shared Task on Parallel Corpus Filtering and Alignment
Philipp Koehn | Vishrav Chaudhary | Ahmed El-Kishky | Naman Goyal | Peng-Jen Chen | Francisco Guzmán
Proceedings of the Fifth Conference on Machine Translation
Following the two preceding WMT Shared Tasks on Parallel Corpus Filtering (Koehn et al., 2018, 2019), we posed again the challenge of assigning sentence-level quality scores for very noisy corpora of sentence pairs crawled from the web, with the goal of sub-selecting the highest-quality data to be used to train machine translation systems. This year, the task tackled the low-resource conditions of Pashto–English and Khmer–English and also included the challenge of sentence alignment from document pairs.
An exploratory approach to the Parallel Corpus Filtering shared task WMT20
Ankur Kejriwal | Philipp Koehn
Proceedings of the Fifth Conference on Machine Translation
In this document we describe our submission to the parallel corpus filtering task, using multilingual word embeddings, language models, and an ensemble of pre- and post-filtering rules. We use the norms of the embeddings and the perplexities of the language models, along with the pre/post-filtering rules, to complement the LASER baseline scores, and ultimately obtain an improvement on the dev set for both language pairs.
Dual Conditional Cross Entropy Scores and LASER Similarity Scores for the WMT20 Parallel Corpus Filtering Shared Task
Felicia Koerner | Philipp Koehn
Proceedings of the Fifth Conference on Machine Translation
This paper describes our submission to the WMT20 Parallel Corpus Filtering and Alignment for Low-Resource Conditions Shared Task. This year’s corpora are noisy Khmer-English and Pashto-English, with 58.3 million and 11.6 million words respectively (English token count). Our submission focuses on filtering Pashto-English, building on previously successful methods to produce two sets of scores: LASER_LM, a combination of the LASER similarity scores provided in the shared task and perplexity scores from language models, and DCCEF_DUP, dual conditional cross entropy scores combined with a duplication penalty. We improve slightly on the LASER similarity score and find that the provided clean data can successfully be supplemented with a subsampled set of the noisy data, effectively increasing the training data for the models used for dual conditional cross entropy scoring.
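The dual conditional cross-entropy component follows Junczys-Dowmunt (2018): combine the agreement and the magnitude of the two per-token conditional cross-entropies. A minimal sketch of that score, omitting the duplication penalty described above:

```python
import math

def dcce_score(h_fwd, h_bwd):
    """Dual conditional cross-entropy score: h_fwd and h_bwd are the
    length-normalised conditional cross-entropies of a sentence pair
    under forward and backward translation models. Agreement between
    the two models and low absolute entropy both raise the score."""
    return math.exp(-(abs(h_fwd - h_bwd) + 0.5 * (h_fwd + h_bwd)))
```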
2019
Vecalign: Improved Sentence Alignment in Linear Time and Space
Brian Thompson | Philipp Koehn
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
We introduce Vecalign, a novel bilingual sentence alignment method which is linear in time and space with respect to the number of sentences being aligned and which requires only bilingual sentence embeddings. On a standard German–French test set, Vecalign outperforms the previous state-of-the-art method (which has quadratic time complexity and requires a machine translation system) by 5 F1 points. It substantially outperforms the popular Hunalign toolkit at recovering Bible verse alignments in medium- to low-resource language pairs, and it improves downstream MT quality by 1.7 and 1.6 BLEU in Sinhala-English and Nepali-English, respectively, compared to the Hunalign-based Paracrawl pipeline.
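The core idea, monotonic dynamic programming over sentence-embedding costs, can be shown in a deliberately simplified form. The sketch below handles only 1-1 links plus skips with a quadratic table, whereas Vecalign supports many-to-many links and approximates the search in linear time; the skip cost here is an arbitrary illustrative value.

```python
import numpy as np

def monotonic_align(src, tgt, skip_cost=0.5):
    """Monotonic 1-1 alignment of two lists of sentence embeddings by
    dynamic programming over cosine costs, with skips on either side."""
    def cost(i, j):
        return 1 - float(src[i] @ tgt[j] /
                         (np.linalg.norm(src[i]) * np.linalg.norm(tgt[j])))

    n, m = len(src), len(tgt)
    dp = np.full((n + 1, m + 1), np.inf)
    dp[0, 0] = 0.0
    back = {}
    for i in range(n + 1):
        for j in range(m + 1):
            moves = [(1, 0, skip_cost), (0, 1, skip_cost)]
            if i and j:
                moves.append((1, 1, cost(i - 1, j - 1)))
            for di, dj, c in moves:
                if i >= di and j >= dj and dp[i - di, j - dj] + c < dp[i, j]:
                    dp[i, j] = dp[i - di, j - dj] + c
                    back[i, j] = (di, dj)
    links, i, j = [], n, m          # trace back the best path
    while (i, j) in back:
        di, dj = back[i, j]
        if di and dj:
            links.append((i - 1, j - 1))
        i, j = i - di, j - dj
    return links[::-1]
```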
HABLex: Human Annotated Bilingual Lexicons for Experiments in Machine Translation
Brian Thompson | Rebecca Knowles | Xuan Zhang | Huda Khayrallah | Kevin Duh | Philipp Koehn
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Bilingual lexicons are valuable resources used by professional human translators. While these resources can be easily incorporated in statistical machine translation, it is unclear how to best do so in the neural framework. In this work, we present the HABLex dataset, designed to test methods for bilingual lexicon integration into neural machine translation. Our data consists of human generated alignments of words and phrases in machine translation test sets in three language pairs (Russian-English, Chinese-English, and Korean-English), resulting in clean bilingual lexicons which are well matched to the reference. We also present two simple baselines - constrained decoding and continued training - and an improvement to continued training to address overfitting.
The FLORES Evaluation Datasets for Low-Resource Machine Translation: Nepali–English and Sinhala–English
Francisco Guzmán | Peng-Jen Chen | Myle Ott | Juan Pino | Guillaume Lample | Philipp Koehn | Vishrav Chaudhary | Marc’Aurelio Ranzato
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
For machine translation, a vast majority of language pairs in the world are considered low-resource because they have little parallel data available. Besides the technical challenges of learning with limited supervision, it is difficult to evaluate methods trained on low-resource language pairs because of the lack of freely and publicly available benchmarks. In this work, we introduce the FLORES evaluation datasets for Nepali–English and Sinhala–English, based on sentences translated from Wikipedia. Compared to English, these are languages with very different morphology and syntax, for which little out-of-domain parallel data is available and for which relatively large amounts of monolingual data are freely available. We describe our process to collect and cross-check the quality of translations, and we report baseline performance using several learning settings: fully supervised, weakly supervised, semi-supervised, and fully unsupervised. Our experiments demonstrate that current state-of-the-art methods perform rather poorly on this benchmark, posing a challenge to the research community working on low-resource MT. Data and code to reproduce our experiments are available at https://github.com/facebookresearch/flores.
Spelling-Aware Construction of Macaronic Texts for Teaching Foreign-Language Vocabulary
Adithya Renduchintala | Philipp Koehn | Jason Eisner
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
We present a machine foreign-language teacher that modifies text in a student’s native language (L1) by replacing some word tokens with glosses in a foreign language (L2), in such a way that the student can acquire L2 vocabulary simply by reading the resulting macaronic text. The machine teacher uses no supervised data from human students. Instead, to guide the machine teacher’s choice of which words to replace, we equip a cloze language model with a training procedure that can incrementally learn representations for novel words, and use this model as a proxy for the word guessing and learning ability of real human students. We use Mechanical Turk to evaluate two variants of the student model: (i) one that generates a representation for a novel word using only surrounding context and (ii) an extension that also uses the spelling of the novel word.
Overcoming Catastrophic Forgetting During Domain Adaptation of Neural Machine Translation
Brian Thompson | Jeremy Gwinnup | Huda Khayrallah | Kevin Duh | Philipp Koehn
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
Continued training is an effective method for domain adaptation in neural machine translation. However, in-domain gains from adaptation come at the expense of general-domain performance. In this work, we interpret the drop in general-domain performance as catastrophic forgetting of general-domain knowledge. To mitigate it, we adapt Elastic Weight Consolidation (EWC)—a machine learning method for learning a new task without forgetting previous tasks. Our method retains the majority of general-domain performance lost in continued training without degrading in-domain performance, outperforming the previous state-of-the-art. We also explore the full range of general-domain performance available when some in-domain degradation is acceptable.
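EWC adds a quadratic penalty pulling parameters toward their general-domain values, weighted by Fisher information. A hedged PyTorch sketch of that term; the dictionary layout and the lambda value are assumptions for illustration.

```python
def ewc_penalty(model, fisher, old_params, lam=1.0):
    """Quadratic pull toward the general-domain parameters, weighted
    by diagonal Fisher information estimated on general-domain data.
    `fisher` and `old_params` are dicts keyed by parameter name."""
    penalty = 0.0
    for name, p in model.named_parameters():
        penalty = penalty + (fisher[name] * (p - old_params[name]) ** 2).sum()
    return 0.5 * lam * penalty

# total_loss = in_domain_loss + ewc_penalty(model, fisher, old_params)
```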
De-Mixing Sentiment from Code-Mixed Text
Yash Kumar Lal | Vaibhav Kumar | Mrinal Dhar | Manish Shrivastava | Philipp Koehn
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop
Code-mixing is the phenomenon of mixing the vocabulary and syntax of multiple languages in the same sentence. It is an increasingly common occurrence in today’s multilingual society and poses a big challenge when encountered in different downstream tasks. In this paper, we present a hybrid architecture for the task of Sentiment Analysis of English-Hindi code-mixed data. Our method consists of three components, each seeking to alleviate different issues. We first generate subword level representations for the sentences using a CNN architecture. The generated representations are used as inputs to a Dual Encoder Network which consists of two different BiLSTMs - the Collective and Specific Encoder. The Collective Encoder captures the overall sentiment of the sentence, while the Specific Encoder utilizes an attention mechanism in order to focus on individual sentiment-bearing sub-words. This, combined with a Feature Network consisting of orthographic features and specially trained word embeddings, achieves state-of-the-art results - 83.54% accuracy and 0.827 F1 score - on a benchmark dataset.
Parallelizable Stack Long Short-Term Memory
Shuoyang Ding | Philipp Koehn
Proceedings of the Third Workshop on Structured Prediction for NLP
Stack Long Short-Term Memory (StackLSTM) is useful for various applications such as parsing and string-to-tree neural machine translation, but it is also known to be notoriously difficult to parallelize for GPU training because its computations depend on discrete operations. In this paper, we tackle this problem by utilizing the state access patterns of StackLSTM to homogenize computations with regard to different discrete operations. Our parsing experiments show that the method scales up almost linearly with increasing batch size, and our parallelized PyTorch implementation trains significantly faster than the DyNet C++ implementation.
Simple Construction of Mixed-Language Texts for Vocabulary Learning
Adithya Renduchintala | Philipp Koehn | Jason Eisner
Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications
We present a machine foreign-language teacher that takes documents written in a student’s native language and detects situations where it can replace words with their foreign glosses such that new foreign vocabulary can be learned simply through reading the resulting mixed-language text. We show that it is possible to design such a machine teacher without any supervised data from (human) students. We accomplish this by modifying a cloze language model to incrementally learn new vocabulary items, and use this language model as a proxy for the word guessing and learning ability of real students. Our machine foreign-language teacher decides which subset of words to replace by consulting this language model. We evaluate three variants of our student proxy language models through a study on Amazon Mechanical Turk (MTurk). We find that MTurk “students” were able to guess the meanings of foreign words introduced by the machine teacher with high accuracy for both function words as well as content words in two out of the three models. In addition, we show that students are able to retain their knowledge about the foreign words after they finish reading the document.
Proceedings of the Fourth Conference on Machine Translation (Volume 1: Research Papers)
Ondřej Bojar | Rajen Chatterjee | Christian Federmann | Mark Fishel | Yvette Graham | Barry Haddow | Matthias Huck | Antonio Jimeno Yepes | Philipp Koehn | André Martins | Christof Monz | Matteo Negri | Aurélie Névéol | Mariana Neves | Matt Post | Marco Turchi | Karin Verspoor
Proceedings of the Fourth Conference on Machine Translation (Volume 1: Research Papers)
Saliency-driven Word Alignment Interpretation for Neural Machine Translation
Shuoyang Ding | Hainan Xu | Philipp Koehn
Proceedings of the Fourth Conference on Machine Translation (Volume 1: Research Papers)
Despite their original goal to jointly learn to align and translate, Neural Machine Translation (NMT) models, especially Transformer, are often perceived as not learning interpretable word alignments. In this paper, we show that NMT models do learn interpretable word alignments, which could only be revealed with proper interpretation methods. We propose a series of such methods that are model-agnostic, are able to be applied either offline or online, and do not require parameter update or architectural change. We show that under the force decoding setup, the alignments induced by our interpretation method are of better quality than fast-align for some systems, and when performing free decoding, they agree well with the alignments induced by automatic alignment tools.
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)
Ondřej Bojar | Rajen Chatterjee | Christian Federmann | Mark Fishel | Yvette Graham | Barry Haddow | Matthias Huck | Antonio Jimeno Yepes | Philipp Koehn | André Martins | Christof Monz | Matteo Negri | Aurélie Névéol | Mariana Neves | Matt Post | Marco Turchi | Karin Verspoor
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)
Findings of the 2019 Conference on Machine Translation (WMT19)
Loïc Barrault | Ondřej Bojar | Marta R. Costa-jussà | Christian Federmann | Mark Fishel | Yvette Graham | Barry Haddow | Matthias Huck | Philipp Koehn | Shervin Malmasi | Christof Monz | Mathias Müller | Santanu Pal | Matt Post | Marcos Zampieri
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)
This paper presents the results of the premier shared task organized alongside the Conference on Machine Translation (WMT) 2019. Participants were asked to build machine translation systems for any of 18 language pairs, to be evaluated on a test set of news stories. The main metric for this task is human judgment of translation quality. The task was also opened up to additional test suites to probe specific aspects of translation.
Findings of the First Shared Task on Machine Translation Robustness
Xian Li | Paul Michel | Antonios Anastasopoulos | Yonatan Belinkov | Nadir Durrani | Orhan Firat | Philipp Koehn | Graham Neubig | Juan Pino | Hassan Sajjad
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)
We share the findings of the first shared task on improving robustness of Machine Translation (MT). The task provides a testbed representing challenges facing MT models deployed in the real world, and facilitates new approaches to improve models’ robustness to noisy input and domain mismatch. We focus on two language pairs (English-French and English-Japanese), and the submitted systems are evaluated on a blind test set consisting of noisy comments on Reddit and professionally sourced translations. As a new task, we received 23 submissions by 11 participating teams from universities, companies, national labs, etc. All submitted systems achieved large improvements over baselines, with the best improvement having +22.33 BLEU. We evaluated submissions by both human judgment and automatic evaluation (BLEU), which shows high correlations (Pearson’s r = 0.94 and 0.95). Furthermore, we conducted a qualitative analysis of the submitted systems using compare-mt, which revealed their salient differences in handling challenges in this task. Such analysis provides additional insights when there is occasional disagreement between human judgment and BLEU, e.g. systems better at producing colloquial expressions received higher score from human judgment.
Johns Hopkins University Submission for WMT News Translation Task
Kelly Marchisio | Yash Kumar Lal | Philipp Koehn
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)
We describe the work of Johns Hopkins University for the shared task of news translation organized by the Fourth Conference on Machine Translation (2019). We submitted systems for both directions of the English-German language pair. The systems combine multiple techniques – sampling, filtering, iterative backtranslation, and continued training – previously used to improve performance of neural machine translation models. At submission time, we achieve a BLEU score of 38.1 for De-En and 42.5 for En-De translation directions on newstest2019. Post-submission, the score is 38.4 for De-En and 42.8 for En-De. Various experiments conducted in the process are also described.
Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2)
Ondřej Bojar | Rajen Chatterjee | Christian Federmann | Mark Fishel | Yvette Graham | Barry Haddow | Matthias Huck | Antonio Jimeno Yepes | Philipp Koehn | André Martins | Christof Monz | Matteo Negri | Aurélie Névéol | Mariana Neves | Matt Post | Marco Turchi | Karin Verspoor
Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2)
Findings of the WMT 2019 Shared Task on Parallel Corpus Filtering for Low-Resource Conditions
Philipp Koehn | Francisco Guzmán | Vishrav Chaudhary | Juan Pino
Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2)
Following the WMT 2018 Shared Task on Parallel Corpus Filtering, we posed the challenge of assigning sentence-level quality scores for very noisy corpora of sentence pairs crawled from the web, with the goal of sub-selecting 2% and 10% of the highest-quality data to be used to train machine translation systems. This year, the task tackled the low resource condition of Nepali-English and Sinhala-English. Eleven participants from companies, national research labs, and universities participated in this task.
Low-Resource Corpus Filtering Using Multilingual Sentence Embeddings
Vishrav Chaudhary | Yuqing Tang | Francisco Guzmán | Holger Schwenk | Philipp Koehn
Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2)
In this paper, we describe our submission to the WMT19 low-resource parallel corpus filtering shared task. Our main approach is based on the LASER toolkit (Language-Agnostic SEntence Representations), which uses an encoder-decoder architecture trained on a parallel corpus to obtain multilingual sentence representations. We then use the representations directly to score and filter the noisy parallel sentences without additionally training a scoring function. We contrast our approach with other promising methods and show that LASER yields strong results. Finally, we produce an ensemble of different scoring methods and obtain additional gains. Our submission achieved the best overall performance for both the Nepali-English and Sinhala-English 1M tasks by a margin of 1.3 and 1.4 BLEU respectively, compared to the second-best systems. Moreover, our experiments show that this technique is promising for low- and even no-resource scenarios.
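To make the scoring recipe concrete, here is a minimal sketch of embedding-based filtering. The embed functions stand in for LASER encoders, and the word-budget selection is illustrative rather than the paper's exact procedure:

```python
import numpy as np

def cosine_score(src_emb: np.ndarray, tgt_emb: np.ndarray) -> float:
    """Score a sentence pair by the cosine similarity of its embeddings."""
    return float(np.dot(src_emb, tgt_emb)
                 / (np.linalg.norm(src_emb) * np.linalg.norm(tgt_emb)))

def filter_corpus(pairs, embed_src, embed_tgt, budget_words):
    """Keep the highest-scoring sentence pairs up to a word budget."""
    scored = sorted(pairs,
                    key=lambda p: cosine_score(embed_src(p[0]),
                                               embed_tgt(p[1])),
                    reverse=True)
    kept, words = [], 0
    for src, tgt in scored:
        if words >= budget_words:
            break
        kept.append((src, tgt))
        words += len(src.split())
    return kept
```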
Robust Document Representations for Cross-Lingual Information Retrieval in Low-Resource Settings
Mahsa Yarmohammadi | Xutai Ma | Sorami Hisamoto | Muhammad Rahman | Yiming Wang | Hainan Xu | Daniel Povey | Philipp Koehn | Kevin Duh
Proceedings of Machine Translation Summit XVII: Research Track
Controlling the Reading Level of Machine Translation Output
Kelly Marchisio | Jialiang Guo | Cheng-I Lai | Philipp Koehn
Proceedings of Machine Translation Summit XVII: Research Track
Character-Aware Decoder for Translation into Morphologically Rich Languages
Adithya Renduchintala | Pamela Shapiro | Kevin Duh | Philipp Koehn
Proceedings of Machine Translation Summit XVII: Research Track
2018
An Analysis of Source Context Dependency in Neural Machine Translation
Xutai Ma | Ke Li | Philipp Koehn
Proceedings of the 21st Annual Conference of the European Association for Machine Translation
The encoder-decoder with attention model has become the state of the art for machine translation. However, further investigation is needed to understand the internal mechanisms of this end-to-end model. In this paper, we focus on how neural machine translation (NMT) models consider source information while decoding. We propose a numerical measurement of source context dependency in NMT models and analyze the behavior of the NMT decoder with this measurement under several circumstances. Experimental results show that this measurement is an appropriate estimate of source context dependency and is consistent across different domains.
Context and Copying in Neural Machine Translation
Rebecca Knowles | Philipp Koehn
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Neural machine translation systems with subword vocabularies are capable of translating or copying unknown words. In this work, we show that they learn to copy words based on both the context in which the words appear as well as features of the words themselves. In contexts that are particularly copy-prone, they even copy words that they have already learned they should translate. We examine the influence of context and subword features on this and other types of copying behavior.
Exploring Word Sense Disambiguation Abilities of Neural Machine Translation Systems (Non-archival Extended Abstract)
Rebecca Marvin | Philipp Koehn
Proceedings of the 13th Conference of the Association for Machine Translation in the Americas (Volume 1: Research Track)
Lightweight Word-Level Confidence Estimation for Neural Interactive Translation Prediction
Rebecca Knowles | Philipp Koehn
Proceedings of the AMTA 2018 Workshop on Translation Quality Estimation and Automatic Post-Editing
A Comparison of Machine Translation Paradigms for Use in Black-Box Fuzzy-Match Repair
Rebecca Knowles | John Ortega | Philipp Koehn
Proceedings of the AMTA 2018 Workshop on Translation Quality Estimation and Automatic Post-Editing
Iterative Back-Translation for Neural Machine Translation
Vu Cong Duy Hoang | Philipp Koehn | Gholamreza Haffari | Trevor Cohn
Proceedings of the 2nd Workshop on Neural Machine Translation and Generation
We present iterative back-translation, a method for generating increasingly better synthetic parallel data from monolingual data to train neural machine translation systems. Our proposed method is very simple yet effective and highly applicable in practice. We demonstrate improvements in neural machine translation quality in both high- and low-resource scenarios, including the best reported BLEU scores for the WMT 2017 German↔English tasks.
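A schematic of the loop described above. Here train and translate are placeholders for full NMT training and decoding pipelines, so this is a sketch of the procedure rather than the authors' implementation:

```python
def iterative_back_translation(parallel, mono_src, mono_tgt,
                               train, translate, rounds=3):
    """Sketch of the training loop.

    train(pairs)            -> a model trained on (src, tgt) pairs
    translate(model, sents) -> list of translations of sents
    """
    fwd = train(parallel)                           # src -> tgt model
    bwd = train([(t, s) for s, t in parallel])      # tgt -> src model
    for _ in range(rounds):
        # Back-translate target-side monolingual text into synthetic
        # sources, then retrain the forward model on real + synthetic data.
        synth_src = translate(bwd, mono_tgt)
        fwd = train(parallel + list(zip(synth_src, mono_tgt)))
        # Do the symmetric update for the backward model.
        synth_tgt = translate(fwd, mono_src)
        bwd = train([(t, s) for s, t in parallel] +
                    list(zip(synth_tgt, mono_src)))
    return fwd, bwd
```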
Regularized Training Objective for Continued Training for Domain Adaptation in Neural Machine Translation
Huda Khayrallah | Brian Thompson | Kevin Duh | Philipp Koehn
Proceedings of the 2nd Workshop on Neural Machine Translation and Generation
Supervised domain adaptation—where a large generic corpus and a smaller in-domain corpus are both available for training—is a challenge for neural machine translation (NMT). Standard practice is to train a generic model and use it to initialize a second model, then continue training the second model on in-domain data to produce an in-domain model. We add an auxiliary term to the training objective during continued training that minimizes the cross entropy between the in-domain model’s output word distribution and that of the out-of-domain model to prevent the model’s output from differing too much from the original out-of-domain model. We perform experiments on EMEA (descriptions of medicines) and TED (rehearsed presentations), initialized from a general domain (WMT) model. Our method shows improvements over standard continued training by up to 1.5 BLEU.
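A minimal PyTorch sketch of the auxiliary term described above, assuming per-token logits from the adapted (in-domain) model and the frozen out-of-domain model; the weight lam is illustrative:

```python
import torch
import torch.nn.functional as F

def continued_training_loss(in_domain_logits, out_of_domain_logits,
                            target_ids, lam=0.1):
    """NLL on in-domain data plus a penalty keeping the adapted model's
    output distribution close to the frozen out-of-domain model's."""
    nll = F.cross_entropy(in_domain_logits, target_ids)
    p_out = F.softmax(out_of_domain_logits.detach(), dim=-1)
    log_p_in = F.log_softmax(in_domain_logits, dim=-1)
    # Cross entropy H(p_out, p_in) = -sum_w p_out(w) * log p_in(w)
    reg = -(p_out * log_p_in).sum(dim=-1).mean()
    return nll + lam * reg
```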
Document-Level Adaptation for Neural Machine Translation
Sachith Sri Ram Kothur | Rebecca Knowles | Philipp Koehn
Proceedings of the 2nd Workshop on Neural Machine Translation and Generation
It is common practice to adapt machine translation systems to novel domains, but even a well-adapted system may be able to perform better on a particular document if it were to learn from a translator’s corrections within the document itself. We focus on adaptation within a single document – appropriate for an interactive translation scenario where a model adapts to a human translator’s input over the course of a document. We propose two methods: single-sentence adaptation (which performs online adaptation one sentence at a time) and dictionary adaptation (which specifically addresses the issue of translating novel words). Combining the two models results in improvements over both approaches individually, and over baseline systems, even on short documents. On WMT news test data, we observe an improvement of +1.8 BLEU points and +23.3% novel word translation accuracy and on EMEA data (descriptions of medications) we observe an improvement of +2.7 BLEU points and +49.2% novel word translation accuracy.
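A sketch of the single-sentence adaptation loop, with adapt_one_step standing in for one online fine-tuning update on a single pair; the paper's dictionary-adaptation component for novel words is omitted here:

```python
def translate_document(model, sentences, get_correction, adapt_one_step):
    """Single-sentence adaptation: translate, get the post-edit, and take
    one online training step on that pair before the next sentence."""
    outputs = []
    for src in sentences:
        hyp = model.translate(src)
        ref = get_correction(src, hyp)           # translator's correction
        outputs.append(ref)
        model = adapt_one_step(model, src, ref)  # online update on one pair
    return outputs
```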
On the Impact of Various Types of Noise on Neural Machine Translation
Huda Khayrallah | Philipp Koehn
Proceedings of the 2nd Workshop on Neural Machine Translation and Generation
We examine how various types of noise in the parallel training data impact the quality of neural machine translation systems. We create five types of artificial noise and analyze how they degrade performance in neural and statistical machine translation. We find that neural models are generally more harmed by noise than statistical models. For one especially egregious type of noise, they learn to just copy the input sentence.
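The abstract does not enumerate the five noise types, but the copy-prone failure mode it highlights can be simulated by replacing target sides with copies of the source, as in this illustrative sketch:

```python
import random

def inject_copy_noise(pairs, rate, seed=0):
    """Corrupt a fraction of a parallel corpus by replacing the target
    side with a copy of the source (an 'untranslated' failure mode)."""
    rng = random.Random(seed)
    noisy = []
    for src, tgt in pairs:
        if rng.random() < rate:
            noisy.append((src, src))   # target is just the copied source
        else:
            noisy.append((src, tgt))
    return noisy
```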
Proceedings of the Third Conference on Machine Translation: Research Papers
Ondřej Bojar | Rajen Chatterjee | Christian Federmann | Mark Fishel | Yvette Graham | Barry Haddow | Matthias Huck | Antonio Jimeno Yepes | Philipp Koehn | Christof Monz | Matteo Negri | Aurélie Névéol | Mariana Neves | Matt Post | Lucia Specia | Marco Turchi | Karin Verspoor
Proceedings of the Third Conference on Machine Translation: Research Papers
Freezing Subnetworks to Analyze Domain Adaptation in Neural Machine Translation
Brian Thompson | Huda Khayrallah | Antonios Anastasopoulos | Arya D. McCarthy | Kevin Duh | Rebecca Marvin | Paul McNamee | Jeremy Gwinnup | Tim Anderson | Philipp Koehn
Proceedings of the Third Conference on Machine Translation: Research Papers
To better understand the effectiveness of continued training, we analyze the major components of a neural machine translation system (the encoder, decoder, and each embedding space) and consider each component’s contribution to, and capacity for, domain adaptation. We find that freezing any single component during continued training has minimal impact on performance, and that performance is surprisingly good when a single component is adapted while holding the rest of the model fixed. We also find that continued training does not move the model very far from the out-of-domain model, compared to a sensitivity analysis metric, suggesting that the out-of-domain model can provide a good generic initialization for the new domain.
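A sketch of the freezing experiment in PyTorch; the component name prefixes are assumptions about how a particular model's parameters are named:

```python
import torch.nn as nn

def freeze_component(model: nn.Module, prefix: str) -> None:
    """Freeze all parameters whose names start with the given prefix
    (e.g. 'encoder', 'decoder', 'src_embed', 'tgt_embed'), so that
    continued training on in-domain data leaves that component fixed."""
    for name, param in model.named_parameters():
        if name.startswith(prefix):
            param.requires_grad = False

# e.g. freeze_component(nmt_model, "encoder") before continued training:
# gradients then flow only through the unfrozen components.
```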
Proceedings of the Third Conference on Machine Translation: Shared Task Papers
Ondřej Bojar | Rajen Chatterjee | Christian Federmann | Mark Fishel | Yvette Graham | Barry Haddow | Matthias Huck | Antonio Jimeno Yepes | Philipp Koehn | Christof Monz | Matteo Negri | Aurélie Névéol | Mariana Neves | Matt Post | Lucia Specia | Marco Turchi | Karin Verspoor
Proceedings of the Third Conference on Machine Translation: Shared Task Papers
Findings of the 2018 Conference on Machine Translation (WMT18)
Ondřej Bojar | Christian Federmann | Mark Fishel | Yvette Graham | Barry Haddow | Matthias Huck | Philipp Koehn | Christof Monz
Proceedings of the Third Conference on Machine Translation: Shared Task Papers
This paper presents the results of the premier shared task organized alongside the Conference on Machine Translation (WMT) 2018. Participants were asked to build machine translation systems for any of 7 language pairs in both directions, to be evaluated on a test set of news stories. The main metric for this task is human judgment of translation quality. This year, we also opened up the task to additional test sets to probe specific aspects of translation.
The JHU Machine Translation Systems for WMT 2018
Philipp Koehn | Kevin Duh | Brian Thompson
Proceedings of the Third Conference on Machine Translation: Shared Task Papers
We report on the efforts of the Johns Hopkins University to develop neural machine translation systems for the shared task for news translation organized around the Conference for Machine Translation (WMT) 2018. We developed systems for German–English, English–German, and Russian–English. Our novel contributions are iterative back-translation and fine-tuning on test sets from prior years.
Findings of the WMT 2018 Shared Task on Parallel Corpus Filtering
Philipp Koehn | Huda Khayrallah | Kenneth Heafield | Mikel L. Forcada
Proceedings of the Third Conference on Machine Translation: Shared Task Papers
We posed the shared task of assigning sentence-level quality scores for a very noisy corpus of sentence pairs crawled from the web, with the goal of sub-selecting 1% and 10% of high-quality data to be used to train machine translation systems. Seventeen participants from companies, national research labs, and universities took part in this task.
The JHU Parallel Corpus Filtering Systems for WMT 2018
Huda Khayrallah | Hainan Xu | Philipp Koehn
Proceedings of the Third Conference on Machine Translation: Shared Task Papers
This work describes our submission to the WMT18 Parallel Corpus Filtering shared task. We use a slightly modified version of the Zipporah Corpus Filtering toolkit (Xu and Koehn, 2017), which computes an adequacy score and a fluency score for a sentence pair, and use a weighted sum of the scores as the selection criterion. This work differs from Zipporah in that we experiment with using the noisy corpus to be filtered to compute the combination weights, thus avoiding the generation of synthetic data as in standard Zipporah.
2017
Zipporah: a Fast and Scalable Data Cleaning System for Noisy Web-Crawled Parallel Corpora
Hainan Xu | Philipp Koehn
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
We introduce Zipporah, a fast and scalable data cleaning system. We propose a novel type of bag-of-words translation feature, and train logistic regression models to classify good data and synthetic noisy data in the proposed feature space. The trained model is used to score parallel sentences in the data pool for selection. As shown in experiments, Zipporah selects a high-quality parallel corpus from a large, mixed-quality data pool. In particular, for one noisy dataset, Zipporah achieves a 2.1 BLEU improvement while using only 1/5 of the data, compared to using the entire corpus.
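A minimal sketch of the classification step with scikit-learn, assuming feature vectors have already been extracted for clean and synthetic-noise pairs (the exact Zipporah features are described in the paper):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_filter(good_features: np.ndarray, noisy_features: np.ndarray):
    """Train a classifier separating clean pairs from synthetic noise.
    Each row is a feature vector for one sentence pair (e.g. a
    bag-of-words translation score and language-model fluency scores)."""
    X = np.vstack([good_features, noisy_features])
    y = np.concatenate([np.ones(len(good_features)),
                        np.zeros(len(noisy_features))])
    return LogisticRegression(max_iter=1000).fit(X, y)

# Score pairs in the data pool by their probability of being clean:
#   scores = clf.predict_proba(pool_features)[:, 1]
```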
Neural Lattice Search for Domain Adaptation in Machine Translation
Huda Khayrallah | Gaurav Kumar | Kevin Duh | Matt Post | Philipp Koehn
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)
Domain adaptation is a major challenge for neural machine translation (NMT). Given unknown words or new domains, NMT systems tend to generate fluent translations at the expense of adequacy. We present a stack-based lattice search algorithm for NMT and show that constraining its search space with lattices generated by phrase-based machine translation (PBMT) improves robustness. We report consistent BLEU score gains across four diverse domain adaptation tasks involving medical, IT, Koran, or subtitles texts.
CADET: Computer Assisted Discovery Extraction and Translation
Benjamin Van Durme | Tom Lippincott | Kevin Duh | Deana Burchfield | Adam Poliak | Cash Costello | Tim Finin | Scott Miller | James Mayfield | Philipp Koehn | Craig Harman | Dawn Lawrie | Chandler May | Max Thomas | Annabelle Carrell | Julianne Chaloux | Tongfei Chen | Alex Comerford | Mark Dredze | Benjamin Glass | Shudong Hao | Patrick Martin | Pushpendre Rastogi | Rashmi Sankepally | Travis Wolfe | Ying-Ying Tran | Ted Zhang
Proceedings of the IJCNLP 2017, System Demonstrations
Computer Assisted Discovery Extraction and Translation (CADET) is a workbench for helping knowledge workers find, label, and translate documents of interest. It combines a multitude of analytics together with a flexible environment for customizing the workflow for different users. This open-source framework allows for easy development of new research prototypes using a micro-service architecture based atop Docker and Apache Thrift.
Knowledge Tracing in Sequential Learning of Inflected Vocabulary
Adithya Renduchintala | Philipp Koehn | Jason Eisner
Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017)
We present a feature-rich knowledge tracing method that captures a student’s acquisition and retention of knowledge during a foreign language phrase learning task. We model the student’s behavior as making predictions under a log-linear model, and adopt a neural gating mechanism to model how the student updates their log-linear parameters in response to feedback. The gating mechanism allows the model to learn complex patterns of retention and acquisition for each feature, while the log-linear parameterization results in an interpretable knowledge state. We collect human data and evaluate several versions of the model.
Six Challenges for Neural Machine Translation
Philipp Koehn | Rebecca Knowles
Proceedings of the First Workshop on Neural Machine Translation
We explore six challenges for neural machine translation: domain mismatch, amount of training data, rare words, long sentences, word alignment, and beam search. We show both deficiencies and improvements over the quality of phrase-based statistical machine translation.
Proceedings of the Second Conference on Machine Translation
Ondřej Bojar | Christian Buck | Rajen Chatterjee | Christian Federmann | Yvette Graham | Barry Haddow | Matthias Huck | Antonio Jimeno Yepes | Philipp Koehn | Julia Kreutzer
Proceedings of the Second Conference on Machine Translation
Predicting Target Language CCG Supertags Improves Neural Machine Translation
Maria Nădejde | Siva Reddy | Rico Sennrich | Tomasz Dwojak | Marcin Junczys-Dowmunt | Philipp Koehn | Alexandra Birch
Proceedings of the Second Conference on Machine Translation
Findings of the 2017 Conference on Machine Translation (WMT17)
Ondřej Bojar | Rajen Chatterjee | Christian Federmann | Yvette Graham | Barry Haddow | Shujian Huang | Matthias Huck | Philipp Koehn | Qun Liu | Varvara Logacheva | Christof Monz | Matteo Negri | Matt Post | Raphael Rubino | Lucia Specia | Marco Turchi
Proceedings of the Second Conference on Machine Translation
The JHU Machine Translation Systems for WMT 2017
Shuoyang Ding | Huda Khayrallah | Philipp Koehn | Matt Post | Gaurav Kumar | Kevin Duh
Proceedings of the Second Conference on Machine Translation
2016
Machine Translation Quality and Post-Editor Productivity
Marina Sanchez-Torron | Philipp Koehn
Conferences of the Association for Machine Translation in the Americas: MT Researchers' Track
We assessed how different machine translation (MT) systems affect the post-editing (PE) process and product of professional English–Spanish translators. Our model found that for each 1-point increase in BLEU, there is a PE time decrease of 0.16 seconds per word, about 3-4%. The MT system with the lowest BLEU score produced the output that was post-edited to the lowest quality and with the highest PE effort, measured both in HTER and actual PE operations.
Neural Interactive Translation Prediction
Rebecca Knowles | Philipp Koehn
Conferences of the Association for Machine Translation in the Americas: MT Researchers' Track
We present an interactive translation prediction method based on neural machine translation. Even with the same translation quality of the underlying machine translation systems, the neural prediction method yields much higher word prediction accuracy (61.6% vs. 43.3%) than the traditional method based on search graphs, mainly due to better recovery from errors. We also develop efficient means to enable practical deployment.
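The core idea, prefix-constrained prediction, can be sketched as follows; next_word_dist stands in for an NMT decoder step conditioned on the prefix the translator has accepted so far:

```python
import numpy as np

def suggest_continuation(next_word_dist, vocab, prefix_ids, n_words=3):
    """Greedy interactive translation prediction: condition on the
    accepted prefix, then extend it word by word.

    next_word_dist(ids) -> np.ndarray of next-word probabilities.
    """
    ids = list(prefix_ids)
    for _ in range(n_words):
        ids.append(int(np.argmax(next_word_dist(ids))))
    return [vocab[i] for i in ids[len(prefix_ids):]]
```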
Translation of Unknown Words in Low Resource Languages
Biman Gujral | Huda Khayrallah | Philipp Koehn
Conferences of the Association for Machine Translation in the Americas: MT Researchers' Track
A Neural Verb Lexicon Model with Source-side Syntactic Context for String-to-Tree Machine Translation
Maria Nădejde | Alexandra Birch | Philipp Koehn
Proceedings of the 13th International Conference on Spoken Language Translation
String-to-tree MT systems translate verbs without lexical or syntactic context on the source side and with limited target-side context. The lack of context is one reason why verb translation recall is as low as 45.5%. We propose a verb lexicon model trained with a feed-forward neural network that predicts the target verb conditioned on a wide source-side context. We show that a syntactic context extracted from the dependency parse of the source sentence improves the model’s accuracy by 1.5% over a baseline trained on a window context. When used as an extra feature for re-ranking the n-best list produced by the string-to-tree MT system, the verb lexicon model improves verb translation recall by more than 7%.
Analyzing Learner Understanding of Novel L2 Vocabulary
Rebecca Knowles | Adithya Renduchintala | Philipp Koehn | Jason Eisner
Proceedings of the 20th SIGNLL Conference on Computational Natural Language Learning
User Modeling in Language Learning with Macaronic Texts
Adithya Renduchintala | Rebecca Knowles | Philipp Koehn | Jason Eisner
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Creating Interactive Macaronic Interfaces for Language Learning
Adithya Renduchintala | Rebecca Knowles | Philipp Koehn | Jason Eisner
Proceedings of ACL-2016 System Demonstrations
Computer Aided Translation
Philipp Koehn
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts
Moving beyond post-editing machine translation, a number of recent research efforts have advanced computer aided translation methods that allow for more interactivity, richer information such as confidence scores, and the completed feedback loop of instant adaptation of machine translation models to user translations. This tutorial will explain the main techniques for several aspects of computer aided translation: confidence measures; interactive machine translation (interactive translation prediction); bilingual concordancers; translation option display; paraphrasing (alternative translation suggestions); visualization of word alignment; online adaptation; automatic reviewing; integration of translation memory; and eye tracking, logging, and cognitive user models. For each of these, the state of the art and open challenges are presented. The tutorial will also look under the hood of the open source CASMACAT toolkit, which is based on MATECAT and available as a "Home Edition" to be installed on a desktop machine. The target audience of this tutorial is researchers interested in computer aided machine translation and practitioners who want to use or deploy advanced CAT technology.
Proceedings of the First Conference on Machine Translation: Volume 1, Research Papers
Ondřej Bojar | Christian Buck | Rajen Chatterjee | Christian Federmann | Liane Guillou | Barry Haddow | Matthias Huck | Antonio Jimeno Yepes | Aurélie Névéol | Mariana Neves | Pavel Pecina | Martin Popel | Philipp Koehn | Christof Monz | Matteo Negri | Matt Post | Lucia Specia | Karin Verspoor | Jörg Tiedemann | Marco Turchi
Proceedings of the First Conference on Machine Translation: Volume 1, Research Papers
Modeling Selectional Preferences of Verbs and Nouns in String-to-Tree Machine Translation
Maria Nădejde | Alexandra Birch | Philipp Koehn
Proceedings of the First Conference on Machine Translation: Volume 1, Research Papers
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers
Ondřej Bojar | Christian Buck | Rajen Chatterjee | Christian Federmann | Liane Guillou | Barry Haddow | Matthias Huck | Antonio Jimeno Yepes | Aurélie Névéol | Mariana Neves | Pavel Pecina | Martin Popel | Philipp Koehn | Christof Monz | Matteo Negri | Matt Post | Lucia Specia | Karin Verspoor | Jörg Tiedemann | Marco Turchi
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers
Findings of the 2016 Conference on Machine Translation
Ondřej Bojar | Rajen Chatterjee | Christian Federmann | Yvette Graham | Barry Haddow | Matthias Huck | Antonio Jimeno Yepes | Philipp Koehn | Varvara Logacheva | Christof Monz | Matteo Negri | Aurélie Névéol | Mariana Neves | Martin Popel | Matt Post | Raphael Rubino | Carolina Scarton | Lucia Specia | Marco Turchi | Karin Verspoor | Marcos Zampieri
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers
The JHU Machine Translation Systems for WMT 2016
Shuoyang Ding | Kevin Duh | Huda Khayrallah | Philipp Koehn | Matt Post
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers
Findings of the WMT 2016 Bilingual Document Alignment Shared Task
Christian Buck | Philipp Koehn
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers
Quick and Reliable Document Alignment via TF/IDF-weighted Cosine Distance
Christian Buck | Philipp Koehn
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers
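There is no abstract for this entry, but the title names the method. A minimal sketch, under the assumption that target documents have first been machine-translated into the source language so both sides share a vocabulary (the greedy one-to-one matching is illustrative):

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def align_documents(src_docs, tgt_docs_translated):
    """Greedy 1-1 alignment of documents by TF/IDF cosine similarity."""
    vec = TfidfVectorizer().fit(src_docs + tgt_docs_translated)
    sim = cosine_similarity(vec.transform(src_docs),
                            vec.transform(tgt_docs_translated))
    pairs, used_src, used_tgt = [], set(), set()
    # Visit candidate pairs from most to least similar; keep each side once.
    for i, j in zip(*np.unravel_index(np.argsort(-sim, axis=None), sim.shape)):
        if i not in used_src and j not in used_tgt:
            pairs.append((int(i), int(j), float(sim[i, j])))
            used_src.add(i)
            used_tgt.add(j)
    return pairs
```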
2015
The Operation Sequence Model—Combining N-Gram-Based and Phrase-Based Statistical Machine Translation
Nadir Durrani | Helmut Schmid | Alexander Fraser | Philipp Koehn | Hinrich Schütze
Computational Linguistics, Volume 41, Issue 2 - June 2015
Findings of the 2015 Workshop on Statistical Machine Translation
Ondřej Bojar | Rajen Chatterjee | Christian Federmann | Barry Haddow | Matthias Huck | Chris Hokamp | Philipp Koehn | Varvara Logacheva | Christof Monz | Matteo Negri | Matt Post | Carolina Scarton | Lucia Specia | Marco Turchi
Proceedings of the Tenth Workshop on Statistical Machine Translation
The Edinburgh/JHU Phrase-based Machine Translation Systems for WMT 2015
Barry Haddow | Matthias Huck | Alexandra Birch | Nikolay Bogoychev | Philipp Koehn
Proceedings of the Tenth Workshop on Statistical Machine Translation
Edinburgh’s Syntax-Based Systems at WMT 2015
Philip Williams | Rico Sennrich | Maria Nadejde | Matthias Huck | Philipp Koehn
Proceedings of the Tenth Workshop on Statistical Machine Translation
Results of the WMT15 Metrics Shared Task
Miloš Stanojević | Amir Kamran | Philipp Koehn | Ondřej Bojar
Proceedings of the Tenth Workshop on Statistical Machine Translation
2014
Combining domain and topic adaptation for SMT
Eva Hasler | Barry Haddow | Philipp Koehn
Proceedings of the 11th Conference of the Association for Machine Translation in the Americas: MT Researchers Track
Recent years have seen increased interest in adapting translation models to test domains that are known in advance as well as using latent topic representations to adapt to unknown test domains. However, the relationship between domains and latent topics is still somewhat unclear and topic adaptation approaches typically do not make use of domain knowledge in the training data. We show empirically that combining domain and topic adaptation approaches can be beneficial and that topic representations can be used to predict the domain of a test document. Our best combined model yields gains of up to 0.82 BLEU over a domain-adapted translation system and up to 1.67 BLEU over an unadapted system, measured on the stronger of two training conditions.
Statistical machine translation with the Moses toolkit
Hieu Hoang | Matthias Huck | Philipp Koehn
Proceedings of the 11th Conference of the Association for Machine Translation in the Americas: Tutorials
CASMACAT: cognitive analysis and statistical methods for advanced computer aided translation
Philipp Koehn | Michael Carl | Francisco Casacuberta | Eva Marcos
Proceedings of the 17th Annual Conference of the European Association for Machine Translation
Improving machine translation via triangulation and transliteration
Nadir Durrani | Philipp Koehn
Proceedings of the 17th Annual Conference of the European Association for Machine Translation
Edinburgh SLT and MT system description for the IWSLT 2014 evaluation
Alexandra Birch | Matthias Huck | Nadir Durrani | Nikolay Bogoychev | Philipp Koehn
Proceedings of the 11th International Workshop on Spoken Language Translation: Evaluation Campaign
This paper describes the University of Edinburgh’s spoken language translation (SLT) and machine translation (MT) systems for the IWSLT 2014 evaluation campaign. In the SLT track, we participated in the German↔English and English→French tasks. In the MT track, we participated in the German↔English, English→French, Arabic↔English, Farsi→English, Hebrew→English, Spanish↔English, and Portuguese-Brazil↔English tasks. For our SLT submissions, we experimented with comparing operation sequence models with bilingual neural network language models. For our MT submissions, we explored using unsupervised transliteration for languages which have a different script than English, in particular for Arabic, Farsi, and Hebrew. We also investigated syntax-based translation and system combination.
Combined spoken language translation
Markus Freitag | Joern Wuebker | Stephan Peitz | Hermann Ney | Matthias Huck | Alexandra Birch | Nadir Durrani | Philipp Koehn | Mohammed Mediani | Isabel Slawik | Jan Niehues | Eunah Cho | Alex Waibel | Nicola Bertoldi | Mauro Cettolo | Marcello Federico
Proceedings of the 11th International Workshop on Spoken Language Translation: Evaluation Campaign
EU-BRIDGE is a European research project which is aimed at developing innovative speech translation technology. One of the collaborative efforts within EU-BRIDGE is to produce joint submissions of up to four different partners to the evaluation campaign at the 2014 International Workshop on Spoken Language Translation (IWSLT). We submitted combined translations to the German→English spoken language translation (SLT) track as well as to the German→English, English→German and English→French machine translation (MT) tracks. In this paper, we present the techniques which were applied by the different individual translation systems of RWTH Aachen University, the University of Edinburgh, Karlsruhe Institute of Technology, and Fondazione Bruno Kessler. We then show the combination approach developed at RWTH Aachen University which combined the individual systems. The consensus translations yield empirical gains of up to 2.3 points in BLEU and 1.2 points in TER compared to the best individual system.
Investigating the Usefulness of Generalized Word Representations in SMT
Nadir Durrani | Philipp Koehn | Helmut Schmid | Alexander Fraser
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers
The MateCat Tool
Marcello Federico | Nicola Bertoldi | Mauro Cettolo | Matteo Negri | Marco Turchi | Marco Trombetti | Alessandro Cattelan | Antonio Farina | Domenico Lupinetti | Andrea Martines | Alberto Massidda | Holger Schwenk | Loïc Barrault | Frederic Blain | Philipp Koehn | Christian Buck | Ulrich Germann
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: System Demonstrations
Syntax-Based Statistical Machine Translation
Philip Williams | Philipp Koehn
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts
The tutorial explains in detail syntax-based statistical machine translation with synchronous context free grammars (SCFG). It is aimed at researchers who have little background in this area, and gives a comprehensive overview of the main models and methods. While syntax-based models in statistical machine translation have a long history, spanning back almost 20 years, they have only recently shown superior translation quality over the more commonly used phrase-based models, and are now considered state of the art for some language pairs, such as Chinese-English (since ISI's submission to NIST 2006) and English-German (since Edinburgh's submission to WMT 2012). While the field is very dynamic, there is a core set of methods that have become dominant. Such SCFG models are implemented in the open source machine translation toolkit Moses, and the tutors draw from the practical experience of its development. The tutorial focuses on explaining core established concepts in SCFG-based approaches, which are the most popular in this area. The main goal of the tutorial is for the audience to understand how these systems work end-to-end. We review as much relevant literature as necessary, but the tutorial is not primarily a research survey. The tutorial is rounded out with open problems and advanced topics, such as computational challenges, different formalisms for syntax-based models, and the inclusion of semantics.
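As a worked illustration of a synchronous rule (a toy example, not drawn from the tutorial itself), the following sketch shows how a linked nonterminal rewrites source and target sides in lockstep:

```python
# Toy synchronous rule: the source and target sides share the linked
# nonterminal X1, so a substitution applies to both sides at once.
rule_src = ("ne", "X1", "pas")        # French negation pattern
rule_tgt = ("does", "not", "X1")      # its English counterpart

def apply_rule(sub_src, sub_tgt):
    """Substitute a translated sub-phrase for X1 on both sides."""
    src = [sub_src if tok == "X1" else tok for tok in rule_src]
    tgt = [sub_tgt if tok == "X1" else tok for tok in rule_tgt]
    return " ".join(src), " ".join(tgt)

print(apply_rule("mange", "eat"))
# ('ne mange pas', 'does not eat')
```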
Dynamic Topic Adaptation for Phrase-based MT
Eva Hasler | Phil Blunsom | Philipp Koehn | Barry Haddow
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics
CASMACAT: A Computer-assisted Translation Workbench
Vicent Alabau | Christian Buck | Michael Carl | Francisco Casacuberta | Mercedes García-Martínez | Ulrich Germann | Jesús González-Rubio | Robin Hill | Philipp Koehn | Luis Leiva | Bartolomé Mesa-Lao | Daniel Ortiz-Martínez | Herve Saint-Amand | Germán Sanchis Trilles | Chara Tsoukala
Proceedings of the Demonstrations at the 14th Conference of the European Chapter of the Association for Computational Linguistics
Integrating an Unsupervised Transliteration Model into Statistical Machine Translation
Nadir Durrani | Hassan Sajjad | Hieu Hoang | Philipp Koehn
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, volume 2: Short Papers
Refinements to Interactive Translation Prediction Based on Search Graphs
Philipp Koehn | Chara Tsoukala | Herve Saint-Amand
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Proceedings of the EACL 2014 Workshop on Humans and Computer-assisted Translation
Ulrich Germann | Michael Carl | Philipp Koehn | Germán Sanchis-Trilles | Francisco Casacuberta | Robin Hill | Sharon O’Brien
Proceedings of the EACL 2014 Workshop on Humans and Computer-assisted Translation
The Impact of Machine Translation Quality on Human Post-Editing
Philipp Koehn | Ulrich Germann
Proceedings of the EACL 2014 Workshop on Humans and Computer-assisted Translation
Using Feature Structures to Improve Verb Translation in English-to-German Statistical MT
Philip Williams | Philipp Koehn
Proceedings of the 3rd Workshop on Hybrid Approaches to Machine Translation (HyTra)
Proceedings of the Workshop on Interactive Language Learning, Visualization, and Interfaces
Jason Chuang | Spence Green | Marti Hearst | Jeffrey Heer | Philipp Koehn
Proceedings of the Workshop on Interactive Language Learning, Visualization, and Interfaces
Proceedings of the Ninth Workshop on Statistical Machine Translation
Ondřej Bojar | Christian Buck | Christian Federmann | Barry Haddow | Philipp Koehn | Christof Monz | Matt Post | Lucia Specia
Proceedings of the Ninth Workshop on Statistical Machine Translation
Findings of the 2014 Workshop on Statistical Machine Translation
Ondřej Bojar | Christian Buck | Christian Federmann | Barry Haddow | Philipp Koehn | Johannes Leveling | Christof Monz | Pavel Pecina | Matt Post | Herve Saint-Amand | Radu Soricut | Lucia Specia | Aleš Tamchyna
Proceedings of the Ninth Workshop on Statistical Machine Translation
Edinburgh’s Phrase-based Machine Translation Systems for WMT-14
Nadir Durrani | Barry Haddow | Philipp Koehn | Kenneth Heafield
Proceedings of the Ninth Workshop on Statistical Machine Translation
EU-BRIDGE MT: Combined Machine Translation
Markus Freitag | Stephan Peitz | Joern Wuebker | Hermann Ney | Matthias Huck | Rico Sennrich | Nadir Durrani | Maria Nadejde | Philip Williams | Philipp Koehn | Teresa Herrmann | Eunah Cho | Alex Waibel
Proceedings of the Ninth Workshop on Statistical Machine Translation
Edinburgh’s Syntax-Based Systems at WMT 2014
Philip Williams | Rico Sennrich | Maria Nadejde | Matthias Huck | Eva Hasler | Philipp Koehn
Proceedings of the Ninth Workshop on Statistical Machine Translation
Dynamic Topic Adaptation for SMT using Distributional Profiles
Eva Hasler | Barry Haddow | Philipp Koehn
Proceedings of the Ninth Workshop on Statistical Machine Translation
Augmenting String-to-Tree and Tree-to-String Translation with Non-Syntactic Phrases
Matthias Huck | Hieu Hoang | Philipp Koehn
Proceedings of the Ninth Workshop on Statistical Machine Translation
Preference Grammars and Soft Syntactic Constraints for GHKM Syntax-based Statistical Machine Translation
Matthias Huck | Hieu Hoang | Philipp Koehn
Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation
2013
English SLT and MT system description for the IWSLT 2013 evaluation
Alexandra Birch | Nadir Durrani | Philipp Koehn
Proceedings of the 10th International Workshop on Spoken Language Translation: Evaluation Campaign
This paper gives a description of the University of Edinburgh’s (UEDIN) systems for IWSLT 2013. We participated in all the MT tracks and the German-to-English and English-to-French SLT tracks. Our SLT submissions experimented with including ASR uncertainty into the decoding process via confusion networks, and looked at different ways of punctuating ASR output. Our MT submissions are mainly based on a system used in the recent evaluation campaign at the Workshop on Statistical Machine Translation [1]. We additionally explored the use of generalized representations (Brown clusters, POS and morphological tags) translating out of English into European languages.
EU-BRIDGE MT: text translation of talks in the EU-BRIDGE project
Markus Freitag | Stephan Peitz | Joern Wuebker | Hermann Ney | Nadir Durrani | Matthias Huck | Philipp Koehn | Thanh-Le Ha | Jan Niehues | Mohammed Mediani | Teresa Herrmann | Alex Waibel | Nicola Bertoldi | Mauro Cettolo | Marcello Federico
Proceedings of the 10th International Workshop on Spoken Language Translation: Evaluation Campaign
EU-BRIDGE is a European research project which is aimed at developing innovative speech translation technology. This paper describes one of the collaborative efforts within EU-BRIDGE to further advance the state of the art in machine translation between two European language pairs, English→French and German→English. Four research institutions involved in the EU-BRIDGE project combined their individual machine translation systems and participated with a joint setup in the machine translation track of the evaluation campaign at the 2013 International Workshop on Spoken Language Translation (IWSLT). We present the methods and techniques to achieve high translation quality for text translation of talks which are applied at RWTH Aachen University, the University of Edinburgh, Karlsruhe Institute of Technology, and Fondazione Bruno Kessler. We then show how we have been able to considerably boost translation performance (as measured in terms of the metrics BLEU and TER) by means of system combination. The joint setups yield empirical gains of up to 1.4 points in BLEU and 2.8 points in TER on the IWSLT test sets compared to the best single systems.
CASMACAT: Cognitive Analysis and Statistical Methods for Advanced Computer Aided Translation
Philipp Koehn | Michael Carl | Francisco Casacuberta | Eva Marcos
Proceedings of Machine Translation Summit XIV: European projects
Advanced computer aided translation with a web-based workbench
Vicent Alabau | Ragnar Bonk | Christian Buck | Michael Carl | Francisco Casacuberta | Mercedes García-Martínez | Jesús González | Philipp Koehn | Luis Leiva | Bartolomé Mesa-Lao | Daniel Oriz | Hervé Saint-Amand | Germán Sanchis | Chara Tsiukala
Proceedings of the 2nd Workshop on Post-editing Technology and Practice
Grouping Language Model Boundary Words to Speed K–Best Extraction from Hypergraphs
Kenneth Heafield | Philipp Koehn | Alon Lavie
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Dirt Cheap Web-Scale Parallel Text from the Common Crawl
Jason R. Smith | Herve Saint-Amand | Magdalena Plamada | Philipp Koehn | Chris Callison-Burch | Adam Lopez
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Learning to Prune: Context-Sensitive Pruning for Syntactic MT
Wenduan Xu | Yue Zhang | Philip Williams | Philipp Koehn
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Can Markov Models Over Minimal Translation Units Help Phrase-Based SMT?
Nadir Durrani | Alexander Fraser | Helmut Schmid | Hieu Hoang | Philipp Koehn
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Scalable Modified Kneser-Ney Language Model Estimation
Kenneth Heafield | Ivan Pouzyrevsky | Jonathan H. Clark | Philipp Koehn
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Proceedings of the Eighth Workshop on Statistical Machine Translation
Ondrej Bojar | Christian Buck | Chris Callison-Burch | Barry Haddow | Philipp Koehn | Christof Monz | Matt Post | Herve Saint-Amand | Radu Soricut | Lucia Specia
Proceedings of the Eighth Workshop on Statistical Machine Translation
Findings of the 2013 Workshop on Statistical Machine Translation
Ondřej Bojar | Christian Buck | Chris Callison-Burch | Christian Federmann | Barry Haddow | Philipp Koehn | Christof Monz | Matt Post | Radu Soricut | Lucia Specia
Proceedings of the Eighth Workshop on Statistical Machine Translation
The Feasibility of HMEANT as a Human MT Evaluation Metric
Alexandra Birch | Barry Haddow | Ulrich Germann | Maria Nadejde | Christian Buck | Philipp Koehn
Proceedings of the Eighth Workshop on Statistical Machine Translation
Edinburgh’s Machine Translation Systems for European Language Pairs
Nadir Durrani | Barry Haddow | Kenneth Heafield | Philipp Koehn
Proceedings of the Eighth Workshop on Statistical Machine Translation
Edinburgh’s Syntax-Based Machine Translation Systems
Maria Nadejde | Philip Williams | Philipp Koehn
Proceedings of the Eighth Workshop on Statistical Machine Translation
Abstract Meaning Representation for Sembanking
Laura Banarescu | Claire Bonial | Shu Cai | Madalina Georgescu | Kira Griffitt | Ulf Hermjakob | Kevin Knight | Philipp Koehn | Martha Palmer | Nathan Schneider
Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse
2012
Interpolated Backoff for Factored Translation Models
Philipp Koehn | Barry Haddow
Proceedings of the 10th Conference of the Association for Machine Translation in the Americas: Research Papers
We propose interpolated backoff methods to strike a balance between traditional surface-form translation models and factored models that decompose translation into lemma and morphological feature mapping steps. We show that this approach improves translation quality by 0.5 BLEU (German–English) over phrase-based models, due to better translation of rare nouns and adjectives.
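To make the decomposition concrete, here is a minimal sketch (the table layouts, helper callables, and fixed interpolation weight `lam` are illustrative assumptions, not the paper's actual parameterization): a sparse surface-form estimate is interpolated with a smoother backoff estimate that translates the lemma and maps morphological features separately.

```python
# Hypothetical sketch of interpolated backoff for factored translation
# models. Table layouts, helper callables, and the fixed weight `lam`
# are illustrative assumptions, not the paper's parameterization.

def factored_backoff(src, tgt, lemma_table, morph_table, lemmatize, morph_tag):
    """Backoff estimate: translate the lemma, then map morphological features."""
    p_lemma = lemma_table.get((lemmatize(src), lemmatize(tgt)), 0.0)
    p_morph = morph_table.get((morph_tag(src), morph_tag(tgt)), 0.0)
    return p_lemma * p_morph

def interpolated(src, tgt, surface_table, lam=0.7, **factored_args):
    """Interpolate the sparse surface-form estimate with the factored backoff."""
    p_surface = surface_table.get((src, tgt), 0.0)
    return lam * p_surface + (1.0 - lam) * factored_backoff(src, tgt, **factored_args)
```

The intended effect mirrors the abstract: frequent surface forms keep their directly estimated probabilities, while rare inflected forms fall back on the smoother lemma-plus-morphology estimate.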
Open Source Statistical Machine Translation
Philipp Koehn | Hieu Hoang
Proceedings of the 10th Conference of the Association for Machine Translation in the Americas: Tutorials
If you are interested in open-source machine translation but lack hands-on experience, this is the tutorial for you! We will start with background knowledge of statistical machine translation and then walk you through the process of installing and running an SMT system. We will show you how to prepare input data and the most efficient way to train and use your translation systems. We will also discuss solutions to some of the most common issues that LSPs face when using SMT, including tailoring systems to specific clients, preserving document layout and formatting, and efficiently incorporating new translation memories. Previous years’ participants have included software engineers and managers who need a detailed understanding of the SMT process. This is a fast-paced, hands-on tutorial that will cover the skills you need to get up and running with open-source SMT. The teaching will be based on the Moses toolkit, the most popular open-source machine translation software currently available. No prior knowledge of MT is necessary, only an interest in it. A laptop is required for this tutorial, and you should have rudimentary knowledge of using the command line on Windows or Linux.
The UEDIN systems for the IWSLT 2012 evaluation
Eva Hasler | Peter Bell | Arnab Ghoshal | Barry Haddow | Philipp Koehn | Fergus McInnes | Steve Renals | Pawel Swietojanski
Proceedings of the 9th International Workshop on Spoken Language Translation: Evaluation Campaign
This paper describes the University of Edinburgh (UEDIN) systems for the IWSLT 2012 Evaluation. We participated in the ASR (English), MT (English-French, German-English) and SLT (English-French) tracks.
Simulating human judgment in machine translation evaluation campaigns
Philipp Koehn
Proceedings of the 9th International Workshop on Spoken Language Translation: Papers
We present a Monte Carlo model to simulate human judgments in machine translation evaluation campaigns, such as WMT or IWSLT. We use the model to compare different ranking methods and to give guidance on the number of judgments that need to be collected to obtain sufficiently significant distinctions between systems.
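The simulation idea lends itself to a short sketch. In the toy version below, the latent system qualities, the Gaussian judge noise, and ranking by pairwise win ratio are all illustrative assumptions rather than the paper's exact model; rerunning with different seeds and judgment counts probes how many judgments a stable ranking needs.

```python
# Toy Monte Carlo simulation of pairwise human judgments in an MT
# evaluation campaign. Latent qualities and the noise model are assumed.
import random
from collections import defaultdict

def simulated_ranking(qualities, n_judgments, noise=1.0, seed=0):
    """Rank systems by the fraction of sampled pairwise judgments they win."""
    rng = random.Random(seed)
    wins, totals = defaultdict(int), defaultdict(int)
    systems = list(qualities)
    for _ in range(n_judgments):
        a, b = rng.sample(systems, 2)
        # A simulated judge prefers the system whose noisy quality is higher.
        winner = a if qualities[a] + rng.gauss(0, noise) > qualities[b] + rng.gauss(0, noise) else b
        wins[winner] += 1
        totals[a] += 1
        totals[b] += 1
    return sorted(systems, key=lambda s: wins[s] / max(totals[s], 1), reverse=True)

print(simulated_ranking({"sysA": 0.6, "sysB": 0.5, "sysC": 0.4}, n_judgments=5000))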
Sparse lexicalised features and topic adaptation for SMT
Eva Hasler | Barry Haddow | Philipp Koehn
Proceedings of the 9th International Workshop on Spoken Language Translation: Papers
We present a new approach to domain adaptation for SMT that enriches standard phrase-based models with lexicalised word and phrase pair features to help the model select appropriate translations for the target domain (TED talks). In addition, we show how source-side sentence-level topics can be incorporated to make the features differentiate between more fine-grained topics within the target domain (topic adaptation). We compare tuning our sparse features on a development set versus on the entire in-domain corpus and introduce a new method of porting them to larger mixed-domain models. Experimental results show that our features improve performance over a MIRA baseline and that in some cases we can get additional improvements with topic features. We evaluate our methods on two language pairs, English-French and German-English, showing promising results.
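As a rough illustration of the feature templates described above (the feature naming scheme and the way topic probabilities weight the indicators are assumptions for exposition, not the paper's exact definitions):

```python
# Sketch of sparse lexicalised phrase-pair features, optionally conditioned
# on a source-side topic distribution (topic adaptation). Naming is assumed.
def phrase_pair_features(src_phrase, tgt_phrase, topic_dist=None):
    feats = {f"pp_{src_phrase}=>{tgt_phrase}": 1.0}  # plain indicator feature
    if topic_dist is not None:
        # Weight a copy of the indicator by each topic's probability, so the
        # tuned model can prefer different translations under different topics.
        for topic, p in topic_dist.items():
            feats[f"pp_{src_phrase}=>{tgt_phrase}_topic{topic}"] = p
    return feats

print(phrase_pair_features("Bank", "bank", topic_dist={0: 0.8, 1: 0.2}))
```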
Language Model Rest Costs and Space-Efficient Storage
Kenneth Heafield | Philipp Koehn | Alon Lavie
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Proceedings of the Seventh Workshop on Statistical Machine Translation
Chris Callison-Burch | Philipp Koehn | Christof Monz | Matt Post | Radu Soricut | Lucia Specia
Proceedings of the Seventh Workshop on Statistical Machine Translation
Findings of the 2012 Workshop on Statistical Machine Translation
Chris Callison-Burch | Philipp Koehn | Christof Monz | Matt Post | Radu Soricut | Lucia Specia
Proceedings of the Seventh Workshop on Statistical Machine Translation
Towards Effective Use of Training Data in Statistical Machine Translation
Philipp Koehn | Barry Haddow
Proceedings of the Seventh Workshop on Statistical Machine Translation
GHKM Rule Extraction and Scope-3 Parsing in Moses
Philip Williams | Philipp Koehn
Proceedings of the Seventh Workshop on Statistical Machine Translation
Analysing the Effect of Out-of-Domain Data on SMT Systems
Barry Haddow | Philipp Koehn
Proceedings of the Seventh Workshop on Statistical Machine Translation
2011
Left language model state for syntactic machine translation
Kenneth Heafield | Hieu Hoang | Philipp Koehn | Tetsuo Kiso | Marcello Federico
Proceedings of the 8th International Workshop on Spoken Language Translation: Evaluation Campaign
Many syntactic machine translation decoders, including Moses, cdec, and Joshua, implement bottom-up dynamic programming to integrate n-gram language model probabilities into hypothesis scoring. These decoders concatenate hypotheses according to grammar rules, yielding larger hypotheses and eventually complete translations. When hypotheses are concatenated, the language model score is adjusted to account for boundary-crossing n-grams. Words on the boundary of each hypothesis are encoded in state, consisting of left state (the first few words) and right state (the last few words). We speed up concatenation by encoding left state using data structure pointers in lieu of vocabulary indices and by avoiding unnecessary queries. To increase the decoder’s opportunities to recombine hypotheses, we minimize the number of words encoded by left state. This has the effect of reducing search errors made by the decoder. The resulting gain in model score is smaller than for right state minimization, which we explain by observing a relationship between state minimization and language model probability. With a fixed cube pruning pop limit, we show a 3–6% reduction in CPU time and improved model scores. Reducing the pop limit to the point where model scores tie the baseline yields a net 11% reduction in CPU time.
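A heavily simplified illustration of why smaller left states help (real decoders also key recombination on span coverage and other state, and the number of boundary words depends on the language model order):

```python
# Toy LM-state recombination: hypotheses that agree on their boundary words
# (left state, right state) are interchangeable for future LM scoring, so
# only the best-scoring one per state key needs to be kept. Encoding fewer
# words in left state makes these keys coarser, so more hypotheses recombine.
def recombine(hypotheses, state_words=2):
    best = {}
    for words, score in hypotheses:
        key = (tuple(words[:state_words]),   # left state: first few words
               tuple(words[-state_words:]))  # right state: last few words
        if key not in best or score > best[key][1]:
            best[key] = (words, score)
    return list(best.values())

hyps = [(["the", "man", "saw", "the", "dog"], -4.2),
        (["the", "man", "sees", "the", "dog"], -4.5)]
print(recombine(hyps))  # the two hypotheses share a state key; one survives
```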
Soft Dependency Constraints for Reordering in Hierarchical Phrase-Based Translation
Yang Gao | Philipp Koehn | Alexandra Birch
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing
Proceedings of the Sixth Workshop on Statistical Machine Translation
Chris Callison-Burch | Philipp Koehn | Christof Monz | Omar F. Zaidan
Proceedings of the Sixth Workshop on Statistical Machine Translation
Findings of the 2011 Workshop on Statistical Machine Translation
Chris Callison-Burch | Philipp Koehn | Christof Monz | Omar Zaidan
Proceedings of the Sixth Workshop on Statistical Machine Translation
Agreement Constraints for Statistical Machine Translation into German
Philip Williams | Philipp Koehn
Proceedings of the Sixth Workshop on Statistical Machine Translation
SampleRank Training for Phrase-Based Machine Translation
Barry Haddow | Abhishek Arun | Philipp Koehn
Proceedings of the Sixth Workshop on Statistical Machine Translation
2010
Fast Approximate String Matching with Suffix Arrays and A* Parsing
Philipp Koehn | Jean Senellart
Proceedings of the 9th Conference of the Association for Machine Translation in the Americas: Research Papers
We present a novel exact solution to the approximate string matching problem in the context of translation memories, where a text segment has to be matched against a large corpus while allowing for errors. We use suffix arrays to detect exact n-gram matches, A* search heuristics to discard match candidates, and A* parsing to validate candidate segments. The method outperforms the canonical baseline by a factor of 100, with average lookup times of 4.3–247ms per segment in a realistic scenario.
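A simplified sketch of the two-stage idea: exact n-gram matches cheaply propose candidate segments, and only the survivors pay for an expensive similarity check. Here a plain n-gram index stands in for the suffix array, and difflib's ratio stands in for the A*-guided validation; both substitutions are mine, not the paper's method.

```python
# Two-stage fuzzy lookup against a translation memory (illustrative only).
from collections import defaultdict
from difflib import SequenceMatcher

def build_ngram_index(segments, n=3):
    """Map each token n-gram to the set of segments containing it."""
    index = defaultdict(set)
    for i, seg in enumerate(segments):
        toks = seg.split()
        for j in range(len(toks) - n + 1):
            index[tuple(toks[j:j + n])].add(i)
    return index

def fuzzy_match(query, segments, index, n=3, min_ratio=0.7):
    """Propose candidates via shared n-grams, then validate with edit similarity."""
    toks = query.split()
    candidates = set()
    for j in range(len(toks) - n + 1):
        candidates |= index.get(tuple(toks[j:j + n]), set())
    scored = ((SequenceMatcher(None, query, segments[i]).ratio(), segments[i])
              for i in candidates)
    return max((s for s in scored if s[0] >= min_ratio), default=None)
```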
Machine Translation with Open Source Software
Philipp Koehn | Hieu Hoang
Proceedings of the 9th Conference of the Association for Machine Translation in the Americas: Tutorials
Convergence of Translation Memory and Statistical Machine Translation
Philipp Koehn | Jean Senellart
Proceedings of the Second Joint EM+/CNGL Workshop: Bringing MT to the User: Research on Integrating MT in the Translation Industry
We present two methods that merge ideas from statistical machine translation (SMT) and translation memories (TM). We use a TM to retrieve matches for source segments, and replace the mismatched parts with instructions to an SMT system to fill in the gap. We show that for fuzzy matches of over 70%, one method outperforms both SMT and TM baselines.
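The gap-filling idea can be sketched as follows. This is a hedged toy version: the real methods use word alignments to splice the matched parts of the TM's target side, whereas this sketch only marks the mismatched source spans that would be handed to the SMT decoder.

```python
# Mark the parts of a new source segment that a fuzzy TM match does not
# cover, so an SMT system can translate just those gaps (toy version).
from difflib import SequenceMatcher

def mark_gaps(new_source, tm_source):
    new_toks = new_source.split()
    sm = SequenceMatcher(None, new_toks, tm_source.split())
    out = []
    for op, i1, i2, j1, j2 in sm.get_opcodes():
        span = " ".join(new_toks[i1:i2])
        if op == "equal":
            out.append(span)  # reuse the TM translation for this span
        elif span:
            out.append(f"<translate>{span}</translate>")  # SMT fills the gap
    return " ".join(out)

print(mark_gaps("the quick brown fox", "the slow brown fox"))
# -> the <translate>quick</translate> brown fox
```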
Enabling Monolingual Translators: Post-Editing vs. Options
Philipp Koehn
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR
Chris Callison-Burch | Philipp Koehn | Christof Monz | Kay Peterson | Omar Zaidan
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR
Findings of the 2010 Joint Workshop on Statistical Machine Translation and Metrics for Machine Translation
Chris Callison-Burch | Philipp Koehn | Christof Monz | Kay Peterson | Mark Przybocki | Omar Zaidan
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR
More Linguistic Annotation for Statistical Machine Translation
Philipp Koehn | Barry Haddow | Philip Williams | Hieu Hoang
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR
Aiding Pronoun Translation with Co-Reference Resolution
Ronan Le Nagard | Philipp Koehn
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR
A Unified Approach to Minimum Risk Training and Decoding
Abhishek Arun | Barry Haddow | Philipp Koehn
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR
Improved Translation with Source Syntax Labels
Hieu Hoang | Philipp Koehn
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR
2009
Human translation and machine translation
Philipp Koehn
Proceedings of the 6th International Workshop on Spoken Language Translation: Plenaries
A unified framework for phrase-based, hierarchical, and syntax-based statistical machine translation
Hieu Hoang | Philipp Koehn | Adam Lopez
Proceedings of the 6th International Workshop on Spoken Language Translation: Papers
Despite many differences between phrase-based, hierarchical, and syntax-based translation models, their training and testing pipelines are strikingly similar. Drawing on this fact, we extend the Moses toolkit to implement hierarchical and syntactic models, making it the first open source toolkit with end-to-end support for all three of these popular models in a single package. This extension substantially lowers the barrier to entry for machine translation research across multiple models.
462 Machine Translation Systems for Europe
Philipp Koehn | Alexandra Birch | Ralf Steinberger
Proceedings of Machine Translation Summit XII: Papers
Interactive Assistance to Human Translators using Statistical Machine Translation Methods
Philipp Koehn | Barry Haddow
Proceedings of Machine Translation Summit XII: Papers
Selective addition of corpus-extracted phrasal lexical rules to a rule-based machine translation system
Loic Dugast | Jean Senellart | Philipp Koehn
Proceedings of Machine Translation Summit XII: Posters
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing
Philipp Koehn | Rada Mihalcea
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing
Improving Mid-Range Re-Ordering Using Templates of Factors
Hieu Hoang | Philipp Koehn
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)
Word Lattices for Multi-Source Translation
Josh Schroeder | Trevor Cohn | Philipp Koehn
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)
A Web-Based Interactive Computer Aided Translation Tool
Philipp Koehn
Proceedings of the ACL-IJCNLP 2009 Software Demonstrations
Topics in Statistical Machine Translation
Kevin Knight | Philipp Koehn
Tutorial Abstracts of ACL-IJCNLP 2009
Proceedings of the Fourth Workshop on Statistical Machine Translation
Chris Callison-Burch | Philipp Koehn | Christof Monz | Josh Schroeder
Proceedings of the Fourth Workshop on Statistical Machine Translation
Findings of the 2009 Workshop on Statistical Machine Translation
Chris Callison-Burch | Philipp Koehn | Christof Monz | Josh Schroeder
Proceedings of the Fourth Workshop on Statistical Machine Translation
Statistical Post Editing and Dictionary Extraction: Systran/Edinburgh Submissions for ACL-WMT2009
Loic Dugast | Jean Senellart | Philipp Koehn
Proceedings of the Fourth Workshop on Statistical Machine Translation
Edinburgh’s Submission to all Tracks of the WMT 2009 Shared Task with Reordering and Speed Improvements to Moses
Philipp Koehn | Barry Haddow
Proceedings of the Fourth Workshop on Statistical Machine Translation
A Systematic Analysis of Translation Model Search Spaces
Michael Auli | Adam Lopez | Hieu Hoang | Philipp Koehn
Proceedings of the Fourth Workshop on Statistical Machine Translation
Monte Carlo inference and maximization for phrase-based translation
Abhishek Arun | Chris Dyer | Barry Haddow | Phil Blunsom | Adam Lopez | Philipp Koehn
Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL-2009)
2008
Predicting Success in Machine Translation
Alexandra Birch | Miles Osborne | Philipp Koehn
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing
Large and Diverse Language Models for Statistical Machine Translation
Holger Schwenk | Philipp Koehn
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-II
Enriching Morphologically Poor Languages for Statistical Machine Translation
Eleftherios Avramidis | Philipp Koehn
Proceedings of ACL-08: HLT
Proceedings of the Third Workshop on Statistical Machine Translation
Chris Callison-Burch | Philipp Koehn | Christof Monz | Josh Schroeder | Cameron Shaw Fordyce
Proceedings of the Third Workshop on Statistical Machine Translation
Further Meta-Evaluation of Machine Translation
Chris Callison-Burch | Cameron Fordyce | Philipp Koehn | Christof Monz | Josh Schroeder
Proceedings of the Third Workshop on Statistical Machine Translation
Towards better Machine Translation Quality for the German-English Language Pairs
Philipp Koehn | Abhishek Arun | Hieu Hoang
Proceedings of the Third Workshop on Statistical Machine Translation
Can we Relearn an RBMT System?
Loïc Dugast | Jean Senellart | Philipp Koehn
Proceedings of the Third Workshop on Statistical Machine Translation
Design of the Moses Decoder for Statistical Machine Translation
Hieu Hoang | Philipp Koehn
Software Engineering, Testing, and Quality Assurance for Natural Language Processing
2007
The University of Edinburgh system description for IWSLT 2007
Josh Schroeder | Philipp Koehn
Proceedings of the Fourth International Workshop on Spoken Language Translation
We present the University of Edinburgh’s submission for the IWSLT 2007 shared task. Our efforts focused on adapting our statistical machine translation system to the open data conditions for the Italian-English task of the evaluation campaign. We examine the challenges of building a system with a limited set of in-domain development data (SITAL), a small training corpus in a related but distinct domain (BTEC), and a large out-of-domain corpus (Europarl). We concentrated on the corrected text track, and present additional results of our experiments using the open-source Moses MT system with speech input.
EuroMatrix – machine translation for all European languages
Philipp Koehn
Proceedings of Machine Translation Summit XI: Invited papers
Online learning methods for discriminative training of phrase based statistical machine translation
Abhishek Arun | Philipp Koehn
Proceedings of Machine Translation Summit XI: Papers
Statistical machine translation
Kevin Knight | Philipp Koehn
Proceedings of Machine Translation Summit XI: Tutorials
Evaluating evaluation – lessons from the WMT 2007 shared task
Philipp Koehn | Chris Callison-Burch
Proceedings of the Workshop on Automatic procedures in MT evaluation
Chinese Syntactic Reordering for Statistical Machine Translation
Chao Wang | Michael Collins | Philipp Koehn
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)
Factored Translation Models
Philipp Koehn | Hieu Hoang
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)
Moses: Open Source Toolkit for Statistical Machine Translation
Philipp Koehn | Hieu Hoang | Alexandra Birch | Chris Callison-Burch | Marcello Federico | Nicola Bertoldi | Brooke Cowan | Wade Shen | Christine Moran | Richard Zens | Chris Dyer | Ondřej Bojar | Alexandra Constantin | Evan Herbst
Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics Companion Volume Proceedings of the Demo and Poster Sessions
Proceedings of the Second Workshop on Statistical Machine Translation
Chris Callison-Burch | Philipp Koehn | Cameron Shaw Fordyce | Christof Monz
Proceedings of the Second Workshop on Statistical Machine Translation
CCG Supertags in Factored Statistical Machine Translation
Alexandra Birch | Miles Osborne | Philipp Koehn
Proceedings of the Second Workshop on Statistical Machine Translation
(Meta-) Evaluation of Machine Translation
Chris Callison-Burch | Cameron Fordyce | Philipp Koehn | Christof Monz | Josh Schroeder
Proceedings of the Second Workshop on Statistical Machine Translation
Statistical Post-Editing on SYSTRAN’s Rule-Based Translation System
Loïc Dugast | Jean Senellart | Philipp Koehn
Proceedings of the Second Workshop on Statistical Machine Translation
Experiments in Domain Adaptation for Statistical Machine Translation
Philipp Koehn | Josh Schroeder
Proceedings of the Second Workshop on Statistical Machine Translation
2006
Statistical machine translation and hybrid machine translation
Philipp Koehn
Proceedings of the 7th Conference of the Association for Machine Translation in the Americas: Panel on hybrid machine translation: why and how?
Re-evaluating the Role of Bleu in Machine Translation Research
Chris Callison-Burch | Miles Osborne | Philipp Koehn
11th Conference of the European Chapter of the Association for Computational Linguistics
Improved Statistical Machine Translation Using Paraphrases
Chris Callison-Burch | Philipp Koehn | Miles Osborne
Proceedings of the Human Language Technology Conference of the NAACL, Main Conference
Proceedings on the Workshop on Statistical Machine Translation
Philipp Koehn | Christof Monz
Proceedings on the Workshop on Statistical Machine Translation
Manual and Automatic Evaluation of Machine Translation between European Languages
Philipp Koehn | Christof Monz
Proceedings on the Workshop on Statistical Machine Translation
Constraining the Phrase-Based, Joint Probability Statistical Translation Model
Alexandra Birch | Chris Callison-Burch | Miles Osborne | Philipp Koehn
Proceedings on the Workshop on Statistical Machine Translation
2005
Edinburgh System Description for the 2005 IWSLT Speech Translation Evaluation
Philipp Koehn | Amittai Axelrod | Alexandra Birch Mayne | Chris Callison-Burch | Miles Osborne | David Talbot
Proceedings of the Second International Workshop on Spoken Language Translation
Europarl: A Parallel Corpus for Statistical Machine Translation
Philipp Koehn
Proceedings of Machine Translation Summit X: Papers
We collected a corpus of parallel text in 11 languages from the proceedings of the European Parliament, which are published on the web. This corpus has found widespread use in the NLP community. Here, we focus on its acquisition and its application as training data for statistical machine translation (SMT). We trained SMT systems for 110 language pairs, which offer interesting insights into the challenges ahead.
Clause Restructuring for Statistical Machine Translation
Michael Collins | Philipp Koehn | Ivona Kučerová
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)
Proceedings of the ACL Workshop on Building and Using Parallel Texts
Philipp Koehn | Joel Martin | Rada Mihalcea | Christof Monz | Ted Pedersen
Proceedings of the ACL Workshop on Building and Using Parallel Texts
Shared Task: Statistical Machine Translation between European Languages
Philipp Koehn | Christof Monz
Proceedings of the ACL Workshop on Building and Using Parallel Texts
2004
Introduction to statistical machine translation
Philipp Koehn | Kevin Knight
Proceedings of the 6th Conference of the Association for Machine Translation in the Americas: Tutorial Descriptions
Statistical Significance Tests for Machine Translation Evaluation
Philipp Koehn
Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing
2003
Empirical Methods for Compound Splitting
Philipp Koehn | Kevin Knight
10th Conference of the European Chapter of the Association for Computational Linguistics
Statistical Phrase-Based Translation
Philipp Koehn | Franz J. Och | Daniel Marcu
Proceedings of the 2003 Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics
Desparately Seeking Cebuano
Douglas W. Oard | David Doermann | Bonnie Dorr | Daqing He | Philip Resnik | Amy Weinberg | William Byrne | Sanjeev Khudanpur | David Yarowsky | Anton Leuski | Philipp Koehn | Kevin Knight
Companion Volume of the Proceedings of HLT-NAACL 2003 - Short Papers
What’s New in Statistical Machine Translation
Kevin Knight | Philipp Koehn
Companion Volume of the Proceedings of HLT-NAACL 2003 - Tutorial Abstracts
Feature-Rich Statistical Translation of Noun Phrases
Philipp Koehn | Kevin Knight
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics
2002
Learning a Translation Lexicon from Monolingual Corpora
Philipp Koehn | Kevin Knight
Proceedings of the ACL-02 Workshop on Unsupervised Lexical Acquisition
2001
Co-authors
- Barry Haddow 49
- Christof Monz 44
- Ondřej Bojar 28
- Matthias Huck 28
- Christian Federmann 27
- Matt Post 27
- Chris Callison-Burch 21
- Yvette Graham 18
- Hieu Hoang 18
- Kevin Duh 17
- Rajen Chatterjee 15
- Alexandra Birch 14
- Nadir Durrani 14
- Mark Fishel 14
- Huda Khayrallah 14
- Matteo Negri 14
- Lucia Specia 14
- Christian Buck 13
- Marco Turchi 13
- Francisco Guzmán 12
- Antonio Jimeno Yepes 12
- Tom Kocmi 12
- Vishrav Chaudhary 11
- Kevin Knight 11
- Rebecca Knowles 11
- Markus Freitag 10
- Brian Thompson 10
- Philip Williams 10
- Roman Grundkiewicz 9
- Kenneth Heafield 9
- Kenton Murray 9
- Aurelie Neveol 9
- Mariana Neves 9
- Adithya Renduchintala 9
- Shuoyang Ding 8
- Kelly Marchisio 8
- Maria Nadejde 8
- Masaaki Nagata 8
- Martin Popel 8
- Josh Schroeder 8
- Karin Verspoor 8
- Loic Barrault 7
- Francisco Casacuberta 7
- Marcello Federico 7
- Makoto Morishita 7
- Weiting Tan 7
- Antonios Anastasopoulos 6
- Marta R. Costa-jussà 6
- Jason Eisner 6
- Alexander Fraser 6
- Eva Hasler 6
- André F. T. Martins 6
- Toshiaki Nakazawa 6
- Miles Osborne 6
- Herve Saint-Amand 6
- Jean Senellart 6
- Haoran Xu 6
- Abhishek Arun 5
- Eleftherios Avramidis 5
- Michael Carl 5
- James Cross 5
- Ulrich Germann 5
- Thamme Gowda 5
- Juan Pino 5
- Radu Soricut 5
- Marcos Zampieri 5
- Rachel Bawden 4
- Nicola Bertoldi 4
- Yunmo Chen 4
- Loic Dugast 4
- Benjamin Van Durme 4
- Anton Dvorkovich 4
- Ahmed El-Kishky 4
- Angela Fan 4
- Cameron Shaw Fordyce 4
- Gaurav Kumar 4
- Adam Lopez 4
- Xutai Ma 4
- Maja Popović 4
- Holger Schwenk 4
- Rico Sennrich 4
- Mariya Shmatova 4
- Hainan Xu 4
- Omar Zaidan 4
- Bismarck Bamfo Odoom 3
- Fethi Bougares 3
- Mauro Cettolo 3
- Paco Guzman 3
- Marcin Junczys-Dowmunt 3
- Sanjeev Khudanpur 3
- Varvara Logacheva 3
- Jean Maillard 3
- Graham Neubig 3
- Hermann Ney 3
- Santanu Pal 3
- Pavel Pecina 3
- Stephan Peitz 3
- Hassan Sajjad 3
- Germán Sanchis-Trilles 3
- Helmut Schmid 3
- Shuming Shi 3
- Alex Waibel 3
- Matthew Wiesner 3
- Joern Wuebker 3
- Vilém Zouhar 3
- Vicent Alabau 2
- Ekaterina Artemova 2
- Yonatan Belinkov 2
- Shruti Bhosale 2
- Magdalena Biesialska 2
- Phil Blunsom 2
- Nikolay Bogoychev 2
- Laurie Burchell 2
- Alessandro Cattelan 2
- Pinzhen Chen 2
- Peng-Jen Chen 2
- Weiyu Chen 2
- Tongfei Chen 2
- Eunah Cho 2
- Trevor Cohn 2
- Michael Collins 2
- Mona Diab 2
- Chris Dyer 2
- Yukun Feng 2
- Mikel L. Forcada 2
- George Foster 2
- Cynthia Gao 2
- Mercedes García-Martínez 2
- Naman Goyal 2
- Yan Gu 2
- Liane Guillou 2
- Jeremy Gwinnup 2
- Teresa Herrmann 2
- Robin L. Hill 2
- Guoping Huang 2
- Amir Kamran 2
- Marzena Karpinska 2
- Daniel Khashabi 2
- Geza Kovacs 2
- Julia Kreutzer 2
- Yash Kumar Lal 2
- Alon Lavie 2
- Luis A. Leiva 2
- Xian Li 2
- Feng Li 2
- Shuyue Stella Li 2
- Lemao Liu 2
- Siyou Liu 2
- Chenyang Lyu 2
- Eva Marcos 2
- Benjamin Marie 2
- Rebecca Marvin 2
- Paul McNamee 2
- Mohammed Mediani 2
- Bartolomé Mesa-Lao 2
- Paul Michel 2
- Rada Mihalcea 2
- Jan Niehues 2
- Stefano Perrella 2
- Kay Peterson 2
- Carey Priebe 2
- Lorenzo Proietti 2
- Parker Riley 2
- Raphael Rubino 2
- Ali Saad-Eldin 2
- Elizabeth Salesky 2
- Carolina Scarton 2
- Lingfeng Shen 2
- Steinþór Steingrímsson 2
- Jörg Tiedemann 2
- Chau Tran 2
- Chara Tsoukala 2
- Zhaopeng Tu 2
- Neha Verma 2
- Longyue Wang 2
- Taro Watanabe 2
- Andy Way 2
- Conghao Xiong 2
- David Yarowsky 2
- Yulin Yuan 2
- Xuan Zhang 2
- Boyuan Zheng 2
- Liting Zhou 2
- Chengqing Zong 2
- Idris Abdulmumin 1
- Sweta Agrawal 1
- Farhad Akhbardeh 1
- Md Mahfuz Ibn Alam 1
- Anton Alyakin 1
- Kwabena Amponsah-Kaakyire 1
- Tim Anderson 1
- Arkady Arkhangorodsky 1
- Michael Auli 1
- Amittai Axelrod 1
- Niyati Bafna 1
- Laura Banarescu 1
- Marta Bañón 1
- Peter Bell 1
- Laurent Besacier 1
- Alexandra Birch Mayne 1
- Frédéric Blain 1
- Claire Bonial 1
- Ragnar Bonk 1
- Eleftheria Briakou 1
- Deana Burchfield 1
- Bill Byrne 1
- Shu Cai 1
- Annabelle Carrell 1
- Isaac Caswell 1
- Julianne Chaloux 1
- Sihao Chen 1
- Jason Chuang 1
- Jonathan H. Clark 1
- Alex Comerford 1
- Alexandra Constantin 1
- Cash Costello 1
- Brooke Cowan 1
- David Dale 1
- Mrinal Dhar 1
- Georgiana Dinu 1
- David Doermann 1
- Tobias Domhan 1
- Bonnie Dorr 1
- Zi-Yi Dou 1
- Konstantin Dranch 1
- Mark Dredze 1
- Sergey Dukanov 1
- Tomasz Dwojak 1
- Denise Díaz 1
- Sergey Edunov 1
- Cristina España-Bonet 1
- Miquel Esplà-Gomis 1
- Marzieh Fadaee 1
- Antonio Farina 1
- Tim Finin 1
- Orhan Firat 1
- Patrick Foley 1
- Pascale Fung 1
- Matthias Gallé 1
- Yang Gao (扬 高) 1
- Paola Leibny Garcia 1
- Leibny Paola Garcia 1
- Dmitriy Genzel 1
- Madalina Georgescu 1
- Arnab Ghoshal 1
- Benjamin Glass 1
- Jesús González 1
- Jesús González-Rubio 1
- Vedanuj Goswami 1
- Spence Green 1
- Kira Griffitt 1
- Biman Gujral 1
- Jialiang Guo 1
- Thanh-Le Ha 1
- Gholamreza Haffari 1
- Prangthip Hansanti 1
- Shudong Hao 1
- Craig Harman 1
- Leonie Harter 1
- Daqing He 1
- Marti A. Hearst 1
- Jeffrey Heer 1
- Kevin Heffernan 1
- Evan Herbst 1
- Ulf Hermjakob 1
- Sorami Hisamoto 1
- Vu Cong Duy Hoang 1
- Chris Hokamp 1
- Christopher Homan 1
- Yupeng Hou 1
- Junjie Hu 1
- Shujian Huang (书剑 黄) 1
- Macduff Hughes 1
- Hirofumi Inaguma 1
- Wenxiang Jiao 1
- Eric Joanis 1
- Shafiq Joty 1
- Kweonwoo Jung 1
- Elahe Kalbassi 1
- Jungo Kasai 1
- Ankur Kejriwal 1
- Faheem Kirefu 1
- Ahmed Kishky 1
- Tetsuo Kiso 1
- Wei-Jen Ko 1
- Sachith Sri Ram Kothur 1
- Vaibhav Kumar 1
- Ivona Kučerová 1
- Ivana Kvapilíková 1
- Felicia Körner 1
- Cheng-I Lai 1
- Howard Lakougna 1
- Janice Lam 1
- Guillaume Lample 1
- Dawn Lawrie 1
- Rosie Lazar 1
- Ronan Le Nagard 1
- Anton Leuski 1
- Johannes Leveling 1
- Will Lewis 1
- Ke Li 1
- Zhenhao Li 1
- Shuyue Li 1
- Renhao Li 1
- Jiachen Lian 1
- Daniel Licht 1
- Tom Lippincott 1
- Chao-Hong Liu 1
- Qun Liu 1
- Nikola Ljubešić 1
- Chi-kiu Lo 1
- Nicholas Lourie 1
- TaiMing Lu 1
- Jessica Lundin 1
- Domenico Lupinetti 1
- Qingsong Ma 1
- Yufeng Ma 1
- Shervin Malmasi 1
- Saab Mansour 1
- Daniel Marcu 1
- M. Patrick Martin 1
- Joel Martin 1
- Andrea Martines 1
- Alberto Massidda 1
- Chandler May 1
- James Mayfield 1
- Arya D. McCarthy 1
- Fergus McInnes 1
- Chutong Meng 1
- Scott Miller 1
- Muhammad Tasnim Mohiuddin 1
- Christine Moran 1
- Alex Mourachko 1
- Mathias Müller 1
- Ajay Nagesh 1
- Vassilina Nikoulina 1
- Mengmeng Niu 1
- Michal Novák 1
- Douglas W. Oard 1
- Franz Josef Och 1
- Daniel Oriz 1
- John Ortega 1
- Sergio Ortiz Rojas 1
- Daniel Ortiz-Martínez 1
- Myle Ott 1
- Sharon O’Brien 1
- Martha Palmer 1
- Eric Paquin 1
- Youngser Park 1
- Ted Pedersen 1
- Magdalena Plamadă 1
- Adam Poliak 1
- Ivan Pouzyrevsky 1
- Daniel Povey 1
- Mark Przybocki 1
- Guanghui Qin 1
- Muhammad Rahman 1
- Navid Rajabi 1
- Gema Ramírez-Sánchez 1
- Marc’Aurelio Ranzato 1
- Pushpendre Rastogi 1
- Siva Reddy 1
- Steve Renals 1
- Philip Resnik 1
- Elijah Rippeth 1
- Nathaniel Robinson 1
- Christophe Ropers 1
- Kaushik Ram Sadagopan 1
- Rashmi Sankepally 1
- Elsa Sarrías 1
- Patrícia Schmidtová 1
- Nathan Schneider 1
- Hinrich Schütze 1
- Leopoldo Pla Sempere 1
- Pamela Shapiro 1
- Wade Shen 1
- Manish Shrivastava 1
- Isabel Slawik 1
- Steve Sloto 1
- Jason Smith 1
- Ziang Song 1
- Miloš Stanojević 1
- Ralf Steinberger 1
- Marek Strelec 1
- Simeng Sun 1
- Jun Suzuki 1
- Pawel Swietojanski 1
- Eduardo Sánchez 1
- Marina Sánchez-Torrón 1
- David Talbot 1
- Aleš Tamchyna 1
- Grace Tang 1
- Yuqing Tang 1
- Allahsera Auguste Tapo 1
- Luis Tavarez-Arce 1
- Max Thomas 1
- Paden Tomasello 1
- Ying-Ying Tran 1
- Marco Trombetti 1
- Chara Tsiukala 1
- Sylwia Tur 1
- Valentin Vydrin 1
- William Waites 1
- Skyler Wang 1
- Xing Wang 1
- Songsheng Wang 1
- Chao Wang (王超) 1
- Yiming Wang 1
- Bonnie Webber 1
- Amy Weinberg 1
- Rachel Wicks 1
- Dion Wiggins 1
- Travis Wolfe 1
- Derek F. Wong (黄辉) 1
- Minghao Wu 1
- Jiahao Xu 1
- Wenduan Xu 1
- Min Yang 1
- Lisa Yankovskaya 1
- Mahsa Yarmohammadi 1
- Dian Yu 1
- Jaume Zaragoza 1
- Richard Zens 1
- Jingyu Zhang 1
- Shijie Zhang 1
- Chenyu Zhang 1
- Ted Zhang 1
- Yue Zhang 1
- Alp Öktem 1