Adrià de Gispert

Also published as: Adria de Gispert, Adrià De Gispert, Adrià Gispert

2019

Domain Adaptive Inference for Neural Machine Translation
Danielle Saunders | Felix Stahlberg | Adrià de Gispert | Bill Byrne
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

We investigate adaptive ensemble weighting for Neural Machine Translation, addressing the case of improving performance on a new and potentially unknown domain without sacrificing performance on the original domain. We adapt sequentially across two Spanish-English and three English-German tasks, comparing unregularized fine-tuning, L2 and Elastic Weight Consolidation. We then report a novel scheme for adaptive NMT ensemble decoding by extending Bayesian Interpolation with source information, and report strong improvements across test domains without access to the domain label.

CUED@WMT19:EWC&LMs
Felix Stahlberg | Danielle Saunders | Adrià de Gispert | Bill Byrne
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)

Two techniques provide the fabric of the Cambridge University Engineering Department’s (CUED) entry to the WMT19 evaluation campaign: elastic weight consolidation (EWC) and different forms of language modelling (LMs). We report substantial gains by fine-tuning very strong baselines on former WMT test sets using a combination of checkpoint averaging and EWC. A sentence-level Transformer LM and a document-level LM based on a modified Transformer architecture yield further gains. As in previous years, we also extract n-gram probabilities from SMT lattices which can be seen as a source-conditioned n-gram LM.

Controlling Japanese Honorifics in English-to-Japanese Neural Machine Translation
Weston Feely | Eva Hasler | Adrià de Gispert
Proceedings of the 6th Workshop on Asian Translation

In the Japanese language different levels of honorific speech are used to convey respect, deference, humility, formality and social distance. In this paper, we present a method for controlling the level of formality of Japanese output in English-to-Japanese neural machine translation (NMT). By using heuristics to identify honorific verb forms, we classify Japanese sentences as being one of three levels of informal, polite, or formal speech in parallel text. The English source side is marked with a feature that identifies the level of honorific speech present in the Japanese target side. We use this parallel text to train an English-Japanese NMT model capable of producing Japanese translations in different honorific speech styles for the same English input sentence.

2018

Multi-representation ensembles and delayed SGD updates improve syntax-based NMT
Danielle Saunders | Felix Stahlberg | Adrià de Gispert | Bill Byrne
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

We explore strategies for incorporating target syntax into Neural Machine Translation. We specifically focus on syntax in ensembles containing multiple sentence representations. We formulate beam search over such ensembles using WFSTs, and describe a delayed SGD update training procedure that is especially effective for long representations like linearized syntax. Our approach gives state-of-the-art performance on a difficult Japanese-English task.

Neural Machine Translation Decoding with Terminology Constraints
Eva Hasler | Adrià de Gispert | Gonzalo Iglesias | Bill Byrne
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)

Despite the impressive quality improvements yielded by neural machine translation (NMT) systems, controlling their translation output to adhere to user-provided terminology constraints remains an open problem. We describe our approach to constrained neural decoding based on finite-state machines and multi-stack decoding which supports target-side constraints as well as constraints with corresponding aligned input text spans. We demonstrate the performance of our framework on multiple translation tasks and motivate the need for constrained decoding with attentions as a means of reducing misplacement and duplication when translating user constraints.

Accelerating NMT Batched Beam Decoding with LMBR Posteriors for Deployment
Gonzalo Iglesias | William Tambellini | Adrià De Gispert | Eva Hasler | Bill Byrne
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 3 (Industry Papers)

We describe a batched beam decoding algorithm for NMT with LMBR n-gram posteriors, showing that LMBR techniques still yield gains on top of the best recently reported results with Transformers. We also discuss acceleration strategies for deployment, and the effect of the beam size and batching on memory and speed.

Turning NMT Research into Commercial Products
Dragos Munteanu | Adrià Gispert
Proceedings of the 13th Conference of the Association for Machine Translation in the Americas (Volume 2: User Track)

The University of Cambridge’s Machine Translation Systems for WMT18
Felix Stahlberg | Adrià de Gispert | Bill Byrne
Proceedings of the Third Conference on Machine Translation: Shared Task Papers

The University of Cambridge submission to the WMT18 news translation task focuses on the combination of diverse models of translation. We compare recurrent, convolutional, and self-attention-based neural models on German-English, English-German, and Chinese-English. Our final system combines all neural models together with a phrase-based SMT system in an MBR-based scheme. We report small but consistent gains on top of strong Transformer ensembles.

2017

A Comparison of Neural Models for Word Ordering
Eva Hasler | Felix Stahlberg | Marcus Tomalin | Adrià de Gispert | Bill Byrne
Proceedings of the 10th International Conference on Natural Language Generation

We compare several language models for the word-ordering task and propose a new bag-to-sequence neural model based on attention-based sequence-to-sequence models. We evaluate the model on a large German WMT data set where it significantly outperforms existing models. We also describe a novel search strategy for LM-based word ordering and report results on the English Penn Treebank. Our best model setup outperforms prior work both in terms of speed and quality.

Neural Machine Translation by Minimising the Bayes-risk with Respect to Syntactic Translation Lattices
Felix Stahlberg | Adrià de Gispert | Eva Hasler | Bill Byrne
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

We present a novel scheme to combine neural machine translation (NMT) with traditional statistical machine translation (SMT). Our approach borrows ideas from linearised lattice minimum Bayes-risk decoding for SMT. The NMT score is combined with the Bayes-risk of the translation according to the SMT lattice. This makes our approach much more flexible than n-best list or lattice rescoring as the neural decoder is not restricted to the SMT search space. We show an efficient and simple way to integrate risk estimation into the NMT decoder which is suitable for word-level as well as subword-unit-level NMT. We test our method on English-German and Japanese-English and report significant gains over lattice rescoring on several data sets for both single and ensembled NMT. The MBR decoder produces entirely new hypotheses far beyond simply rescoring the SMT search space or fixing UNKs in the NMT output.

2013

2012

Can Automatic Post-Editing Make MT More Meaningful?
Kristen Parton | Nizar Habash | Kathleen McKeown | Gonzalo Iglesias | Adrià de Gispert
Proceedings of the 16th Annual conference of the European Association for Machine Translation

2011

Hierarchical Phrase-based Translation Representations
Gonzalo Iglesias | Cyril Allauzen | William Byrne | Adrià de Gispert | Michael Riley
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

2010

Hierarchical Phrase-Based Translation Grammars Extracted from Alignment Posterior Probabilities
Adrià de Gispert | Juan Pino | William Byrne
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

Fluency Constraints for Minimum Bayes-Risk Decoding of Statistical Machine Translation Lattices
Graeme Blackwood | Adrià de Gispert | William Byrne
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

Hierarchical Phrase-Based Translation with Weighted Finite-State Transducers and Shallow-n Grammars
Adrià de Gispert | Gonzalo Iglesias | Graeme Blackwood | Eduardo R. Banga | William Byrne
Computational Linguistics, Volume 36, Issue 3 - September 2010

Efficient Path Counting Transducers for Minimum Bayes-Risk Decoding of Statistical Machine Translation Lattices
Graeme Blackwood | Adrià de Gispert | William Byrne
Proceedings of the ACL 2010 Conference Short Papers

The CUED HiFST System for the WMT10 Translation Shared Task
Juan Pino | Gonzalo Iglesias | Adrià de Gispert | Graeme Blackwood | Jamie Brunning | William Byrne
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR

2009

Context-Dependent Alignment Models for Statistical Machine Translation
Jamie Brunning | Adrià de Gispert | William Byrne
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Hierarchical Phrase-Based Translation with Weighted Finite State Transducers
Gonzalo Iglesias | Adrià de Gispert | Eduardo R. Banga | William Byrne
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Minimum Bayes Risk Combination of Translation Hypotheses from Alternative Morphological Decompositions
Adrià de Gispert | Sami Virpioja | Mikko Kurimo | William Byrne
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers

Rule Filtering by Pattern for Efficient Hierarchical Translation
Gonzalo Iglesias | Adrià de Gispert | Eduardo R. Banga | William Byrne
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

2008

European Language Translation with Weighted Finite State Transducers: The CUED MT System for the 2008 ACL Workshop on SMT
Graeme Blackwood | Adrià de Gispert | Jamie Brunning | William Byrne
Proceedings of the Third Workshop on Statistical Machine Translation

Phrasal Segmentation Models for Statistical Machine Translation
Graeme Blackwood | Adrià de Gispert | William Byrne
Coling 2008: Companion volume: Posters

2006

2005

Statistical Machine Translation of Euparl Data by using Bilingual N-grams
Rafael E. Banchs | Josep M. Crego | Adrià de Gispert | Patrik Lambert | José B. Mariño
Proceedings of the ACL Workshop on Building and Using Parallel Texts

Phrase Linguistic Classification and Generalization for Improving Statistical Machine Translation
Adrià de Gispert
Proceedings of the ACL Student Research Workshop

The TALP Ngram-based SMT System for IWSLT’05
Josep M. Crego | Adria de Gispert | Jose B. Marino
Proceedings of the Second International Workshop on Spoken Language Translation

This paper describes a statistical machine translation system that uses a translation model which is based on bilingual n-grams. When this translation model is log-linearly combined with four specific feature functions, state-of-the-art translations are achieved for Spanish-to-English and English-to-Spanish translation tasks. Some specific results obtained for the EPPS (European Parliament Plenary Sessions) data are presented and discussed. Finally, future research issues are depicted.

Reordered Search, and Tuple Unfolding for Ngram-based SMT
Josep M. Crego | José B. Mariño | Adrià de Gispert
Proceedings of Machine Translation Summit X: Papers

In Statistical Machine Translation, the use of reordering for certain language pairs can produce a significant improvement in translation accuracy. However, the search problem is shown to be NP-hard when arbitrary reorderings are allowed. This paper addresses the question of reordering for an Ngram-based SMT approach following two complementary strategies, namely reordered search and tuple unfolding. These strategies interact to improve translation quality in a Chinese to English task. On the one hand, we allow for an Ngram-based decoder (MARIE) to perform a reordered search over the source sentence, while combining a translation tuples Ngram model, a target language model, a word penalty and a word distance model. Interestingly, even though the translation units are learnt sequentially, its reordered search produces an improved translation. On the other hand, we allow for a modification of the translation units that unfolds the tuples, so that shorter units are learnt from a new parallel corpus, where the source sentences are reordered according to the target language. This tuple unfolding technique reduces data sparseness and, when combined with the reordered search, further boosts translation performance. Translation accuracy and efficiency results are reported for the IWSLT 2004 Chinese to English task.