Nuno Rendeiro


2016

pdf
SMT and Hybrid systems of the QTLeap project in the WMT16 IT-task
Rosa Gaudio | Gorka Labaka | Eneko Agirre | Petya Osenova | Kiril Simov | Martin Popel | Dieke Oele | Gertjan van Noord | Luís Gomes | João António Rodrigues | Steven Neale | João Silva | Andreia Querido | Nuno Rendeiro | António Branco
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers

pdf
Use of Domain-Specific Language Resources in Machine Translation
Sanja Štajner | Andreia Querido | Nuno Rendeiro | João António Rodrigues | António Branco
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

In this paper, we address the problem of Machine Translation (MT) for a specialised domain in a language pair for which only a very small domain-specific parallel corpus is available. We conduct a series of experiments using a purely phrase-based SMT (PBSMT) system and a hybrid MT system (TectoMT), testing three different strategies to overcome the problem of the small amount of in-domain training data. Our results show that adding a small size in-domain bilingual terminology to the small in-domain training corpus leads to the best improvements of a hybrid MT system, while the PBSMT system achieves the best results by adding a combination of in-domain bilingual terminology and a larger out-of-domain corpus. We focus on qualitative human evaluation of the output of two best systems (one for each approach) and perform a systematic in-depth error analysis which revealed advantages of the hybrid MT system over the pure PBSMT system for this specific task.

pdf
Bootstrapping a Hybrid MT System to a New Language Pair
João António Rodrigues | Nuno Rendeiro | Andreia Querido | Sanja Štajner | António Branco
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

The usual concern when opting for a rule-based or a hybrid machine translation (MT) system is how much effort is required to adapt the system to a different language pair or a new domain. In this paper, we describe a way of adapting an existing hybrid MT system to a new language pair, and show that such a system can outperform a standard phrase-based statistical machine translation system with an average of 10 persons/month of work. This is specifically important in the case of domain-specific MT for which there is not enough parallel data for training a statistical machine translation system.