2012
A Computer Assisted Speech Transcription System
Alejandro Revuelta-Martínez
|
Luis Rodríguez
|
Ismael García-Varea
Proceedings of the Demonstrations at the 13th Conference of the European Chapter of the Association for Computational Linguistics
2011
An Interactive Machine Translation System with Online Learning
Daniel Ortiz-Martínez
|
Luis A. Leiva
|
Vicent Alabau
|
Ismael García-Varea
|
Francisco Casacuberta
Proceedings of the ACL-HLT 2011 System Demonstrations
2010
Online Learning for Interactive Statistical Machine Translation
Daniel Ortiz-Martínez
|
Ismael García-Varea
|
Francisco Casacuberta
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
2009
Interactive Machine Translation Based on Partial Statistical Phrase-based Alignments
Daniel Ortiz-Martínez
|
Ismael García-Varea
|
Francisco Casacuberta
Proceedings of the International Conference RANLP-2009
2008
Phrase-level alignment generation using a smoothed loglinear phrase-based statistical alignment model
Daniel Ortiz-Martínez
|
Ismael García-Varea
|
Francisco Casacuberta
Proceedings of the 12th Annual Conference of the European Association for Machine Translation
2007
Combining translation models in statistical machine translation
Jesús Andrés-Ferrer
|
Ismael Garcia-Varea
|
Francisco Casacuberta
Proceedings of the 11th Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages: Papers
2006
Searching for alignments in SMT. A novel approach based on an Estimation of Distribution Algorithm
Luis Rodríguez
|
Ismael García-Varea
|
José A. Gámez
Proceedings on the Workshop on Statistical Machine Translation
pdf
Generalized Stack Decoding Algorithms for Statistical Machine Translation
Daniel Ortiz Martínez
|
Ismael García Varea
|
Francisco Casacuberta
Proceedings on the Workshop on Statistical Machine Translation
2005
Thot: a Toolkit To Train Phrase-based Statistical Translation Models
Daniel Ortiz-Martínez
|
Ismael García-Varea
|
Francisco Casacuberta
Proceedings of Machine Translation Summit X: Papers
In this paper, we present the Thot toolkit, a set of tools to train phrase-based models for statistical machine translation, which is publicly available as open source software. The toolkit obtains phrase-based models from word-based alignment models; to our knowledge, this functionality has not been offered by any publicly available toolkit. The Thot toolkit also implements a new way of estimating phrase models, which yields more complete phrase models than the methods described in the literature, including a segmentation length submodel. The toolkit output can be given in different formats so that it can be used by other statistical machine translation tools such as Pharaoh, a beam search decoder for phrase-based alignment models that was used to perform translation experiments with the generated models. Additionally, the Thot toolkit can be used to obtain the best phrase-level alignment for a sentence pair.
2003
On the use of statistical machine-translation techniques within a memory-based translation system (AMETRA)
Daniel Ortíz
|
Ismael García-Varea
|
Francisco Casacuberta
|
Antonio Lagarda
|
Jorge González
Proceedings of Machine Translation Summit IX: Papers
The goal of the AMETRA project is to make a computer-assisted translation tool from the Spanish language to the Basque language under the memory-based translation framework. The system is based on a large collection of bilingual word-segments. These segments are obtained using linguistic or statistical techniques from a Spanish-Basque bilingual corpus consisting of sentences extracted from the Basque Country’s official government record. One of the tasks within the global information document of the AMETRA project is to study the combination of well-known statistical techniques for the translation of short sequences and techniques for memory-based translation. In this paper, we address the problem of constructing a statistical module to deal with the task of translating segments. The task undertaken in the AMETRA project is compared with other existing translation tasks. This study includes the results of some preliminary experiments we have carried out using well-known statistical machine translation tools and techniques.
2002
Improving Alignment Quality in Statistical Machine Translation Using Context-dependent Maximum Entropy Models
Ismael García Varea
|
Franz J. Och
|
Hermann Ney
|
Francisco Casacuberta
COLING 2002: The 19th International Conference on Computational Linguistics
pdf
abs
Efficient integration of maximum entropy lexicon models within the training of statistical alignment models
Ismael García-Varea
|
Franz J. Och
|
Hermann Ney
|
Francisco Casacuberta
Proceedings of the 5th Conference of the Association for Machine Translation in the Americas: Technical Papers
Maximum entropy (ME) models have been successfully applied to many natural language problems. In this paper, we show how to integrate ME models efficiently within a maximum likelihood training scheme of statistical machine translation models. Specifically, we define a set of context-dependent ME lexicon models and we present how to perform an efficient training of these ME models within the conventional expectation-maximization (EM) training of statistical translation models. Experimental results are also given in order to demonstrate how these ME models improve the results obtained with the traditional translation models. The results are presented by means of alignment quality comparing the resulting alignments with manually annotated reference alignments.
2001
Search algorithms for statistical machine translation based on dynamic programming and pruning techniques
Ismael García-Varea
|
Francisco Casacuberta
Proceedings of Machine Translation Summit VIII
The increasing interest in the statistical approach to Machine Translation is due to the development of effective algorithms for training the probabilistic models proposed so far. However, one of the open problems with statistical machine translation is the design of efficient algorithms for translating a given input string. For some interesting models, only (good) approximate solutions can be found. Recently, a dynamic programming-like algorithm for the IBM-Model 2 has been proposed which is based on an iterative process of refinement solutions. A new dynamic programming-like algorithm is proposed here to deal with more complex IBM models (models 3 to 5). The computational cost of the algorithm is reduced by using an alignment-based pruning technique. Experimental results with the so-called “Tourist Task” are also presented.
pdf
Refined Lexicon Models for Statistical Machine Translation using a Maximum Entropy Approach
Ismael García-Varea
|
Franz J. Och
|
Hermann Ney
|
Francisco Casacuberta
Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics