Rafael E. Banchs

Also published as: Rafael Banchs


2018

Proceedings of the Seventh Named Entities Workshop
Nancy Chen | Rafael E. Banchs | Xiangyu Duan | Min Zhang | Haizhou Li
Proceedings of the Seventh Named Entities Workshop

Attention-based Semantic Priming for Slot-filling
Jiewen Wu | Rafael E. Banchs | Luis Fernando D’Haro | Pavitra Krishnaswamy | Nancy Chen
Proceedings of the Seventh Named Entities Workshop

The problem of sequence labelling in language understanding would benefit from approaches inspired by semantic priming phenomena. We propose that an attention-based RNN architecture can be used to simulate semantic priming for sequence labelling. Specifically, we employ pre-trained word embeddings to characterize the semantic relationship between utterances and labels. We validate the approach using varying sizes of the ATIS and MEDIA datasets, and show up to 1.4-1.9% improvement in F1 score. The developed framework can enable more explainable and generalizable spoken language understanding systems.

NEWS 2018 Whitepaper
Nancy Chen | Xiangyu Duan | Min Zhang | Rafael E. Banchs | Haizhou Li
Proceedings of the Seventh Named Entities Workshop

Transliteration is defined as phonetic translation of names across languages. Transliteration of Named Entities (NEs) is necessary in many applications, such as machine translation, corpus alignment, cross-language IR, information extraction and automatic lexicon acquisition. All such systems call for high-performance transliteration, which is the focus of the shared task in the NEWS 2018 workshop. The objective of the shared task is to promote machine transliteration research by providing a common benchmarking platform for the community to evaluate state-of-the-art technologies.

Report of NEWS 2018 Named Entity Transliteration Shared Task
Nancy Chen | Rafael E. Banchs | Min Zhang | Xiangyu Duan | Haizhou Li
Proceedings of the Seventh Named Entities Workshop

This report presents the results from the Named Entity Transliteration Shared Task conducted as part of the Seventh Named Entities Workshop (NEWS 2018) held at ACL 2018 in Melbourne, Australia. Similar to previous editions of NEWS, the Shared Task featured 19 tasks on proper name transliteration, including 13 different languages and two different Japanese scripts. A total of 6 teams from 8 different institutions participated in the evaluation, submitting 424 runs involving different transliteration methodologies. Four performance metrics were used to report the evaluation results. The NEWS shared task on machine transliteration has successfully achieved its objectives by providing a common ground for the research community to conduct comparative evaluations of state-of-the-art technologies that will benefit future research and development in this area.

2016

Exploring Convolutional and Recurrent Neural Networks in Sequential Labelling for Dialogue Topic Tracking
Seokhwan Kim | Rafael Banchs | Haizhou Li
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)


Continuous Vector Spaces for Cross-language NLP Applications
Rafael E. Banchs
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts

The mathematical metaphor offered by the geometric concept of distance in vector spaces with respect to semantics and meaning has proven useful in many monolingual natural language processing applications. There is also some recent and strong evidence that this paradigm can be useful in the cross-language setting. In this tutorial, we present and discuss some of the most recent advances in exploiting the vector space model paradigm in specific cross-language natural language processing applications, along with a comprehensive review of the theoretical background behind them. First, the tutorial introduces some fundamental concepts of distributional semantics and vector space models. More specifically, the concepts of the distributional hypothesis and term-document matrices are reviewed, followed by a brief discussion of linear and non-linear dimensionality reduction techniques and their implications for the parallel distributed approach to semantic cognition. Next, some classical examples of using vector space models in monolingual natural language processing applications are presented. Specific examples in the areas of information retrieval, related term identification and semantic compositionality are described. Then, the tutorial focuses its attention on the use of the vector space model paradigm in cross-language applications. To this end, some recent examples are presented and discussed in detail, addressing the specific problems of cross-language information retrieval, cross-language sentence matching, and machine translation. Some of the most recent developments in the area of Neural Machine Translation are also discussed. Finally, the tutorial concludes with a discussion about current and future research problems related to the use of vector space models in cross-language settings. Future avenues for scientific research are described, with major emphasis on the extension from vector and matrix representations to tensors, as well as the problem of encoding word position information into the vector-based representations.

A Report on the Automatic Evaluation of Scientific Writing Shared Task
Vidas Daudaravicius | Rafael E. Banchs | Elena Volodina | Courtney Napoles
Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications

Proceedings of the Sixth Named Entity Workshop
Xiangyu Duan | Rafael E. Banchs | Min Zhang | Haizhou Li | A Kumaran
Proceedings of the Sixth Named Entity Workshop

Evaluating and Combining Name Entity Recognition Systems
Ridong Jiang | Rafael E. Banchs | Haizhou Li
Proceedings of the Sixth Named Entity Workshop

Whitepaper of NEWS 2016 Shared Task on Machine Transliteration
Xiangyu Duan | Min Zhang | Haizhou Li | Rafael Banchs | A Kumaran
Proceedings of the Sixth Named Entity Workshop

Report of NEWS 2016 Machine Transliteration Shared Task
Xiangyu Duan | Rafael Banchs | Min Zhang | Haizhou Li | A. Kumaran
Proceedings of the Sixth Named Entity Workshop

Proceedings of the Sixth Workshop on Hybrid Approaches to Translation (HyTra6)
Patrik Lambert | Bogdan Babych | Kurt Eberle | Rafael E. Banchs | Reinhard Rapp | Marta R. Costa-jussà
Proceedings of the Sixth Workshop on Hybrid Approaches to Translation (HyTra6)

2015

Wikification of Concept Mentions within Spoken Dialogues Using Domain Constraints from Wikipedia
Seokhwan Kim | Rafael E. Banchs | Haizhou Li
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

RevUP: Automatic Gap-Fill Question Generation from Educational Texts
Girish Kumar | Rafael Banchs | Luis Fernando D’Haro
Proceedings of the Tenth Workshop on Innovative Use of NLP for Building Educational Applications

Proceedings of the Fifth Named Entity Workshop
Xiangyu Duan | Rafael E. Banchs | Min Zhang | Haizhou Li | A Kumaran
Proceedings of the Fifth Named Entity Workshop

Whitepaper of NEWS 2015 Shared Task on Machine Transliteration
Min Zhang | Haizhou Li | Rafael E. Banchs | A Kumaran
Proceedings of the Fifth Named Entity Workshop

Report of NEWS 2015 Machine Transliteration Shared Task
Rafael E. Banchs | Min Zhang | Xiangyu Duan | Haizhou Li | A. Kumaran
Proceedings of the Fifth Named Entity Workshop

Proceedings of the Fourth Workshop on Hybrid Approaches to Translation (HyTra)
Bogdan Babych | Kurt Eberle | Patrik Lambert | Reinhard Rapp | Rafael E. Banchs | Marta R. Costa-jussà
Proceedings of the Fourth Workshop on Hybrid Approaches to Translation (HyTra)

Automated Simultaneous Interpretation: Hints of a Cognitive Framework for Machine Translation
Rafael E. Banchs
Proceedings of the Fourth Workshop on Hybrid Approaches to Translation (HyTra)

Towards Improving Dialogue Topic Tracking Performances with Wikification of Concept Mentions
Seokhwan Kim | Rafael E. Banchs | Haizhou Li
Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue

2014

A Composite Kernel Approach for Dialog Topic Tracking with Structured Domain Knowledge from Wikipedia
Seokhwan Kim | Rafael E. Banchs | Haizhou Li
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Proceedings of the 3rd Workshop on Hybrid Approaches to Machine Translation (HyTra)
Rafael E. Banchs | Marta R. Costa-jussà | Reinhard Rapp | Patrik Lambert | Kurt Eberle | Bogdan Babych
Proceedings of the 3rd Workshop on Hybrid Approaches to Machine Translation (HyTra)

A Principled Approach to Context-Aware Machine Translation
Rafael E. Banchs
Proceedings of the 3rd Workshop on Hybrid Approaches to Machine Translation (HyTra)

English-to-Hindi system description for WMT 2014: Deep Source-Context Features for Moses
Marta R. Costa-jussà | Parth Gupta | Paolo Rosso | Rafael E. Banchs
Proceedings of the Ninth Workshop on Statistical Machine Translation

Sequential Labeling for Tracking Dynamic Dialog States
Seokhwan Kim | Rafael E. Banchs
Proceedings of the 15th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL)

CHISPA on the GO: A mobile Chinese-Spanish translation service for travellers in trouble
Jordi Centelles | Marta R. Costa-jussà | Rafael E. Banchs
Proceedings of the Demonstrations at the 14th Conference of the European Chapter of the Association for Computational Linguistics

2013

Modeling of term-distance and term-occurrence information for improving n-gram language model performance
Tze Yuang Chong | Rafael E. Banchs | Eng Siong Chng | Haizhou Li
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Proceedings of the Second Workshop on Hybrid Approaches to Translation
Marta Ruiz Costa-jussà | Reinhard Rapp | Patrik Lambert | Kurt Eberle | Rafael E. Banchs | Bogdan Babych
Proceedings of the Second Workshop on Hybrid Approaches to Translation

Workshop on Hybrid Approaches to Translation: Overview and Developments
Marta R. Costa-jussà | Rafael Banchs | Reinhard Rapp | Patrik Lambert | Kurt Eberle | Bogdan Babych
Proceedings of the Second Workshop on Hybrid Approaches to Translation

Meaning Unit Segmentation in English and Chinese: a New Approach to Discourse Phenomena
Jennifer Williams | Rafael Banchs | Haizhou Li
Proceedings of the Workshop on Discourse in Machine Translation

AIDA: Artificial Intelligent Dialogue Agent
Rafael E. Banchs | Ridong Jiang | Seokhwan Kim | Arthur Niswar | Kheng Hui Yeo
Proceedings of the SIGDIAL 2013 Conference

2012

Proceedings of the Joint Workshop on Exploiting Synergies between Information Retrieval and Machine Translation (ESIRMT) and Hybrid Approaches to Machine Translation (HyTra)
Marta R. Costa-jussà | Patrik Lambert | Rafael E. Banchs | Reinhard Rapp | Bogdan Babych
Proceedings of the Joint Workshop on Exploiting Synergies between Information Retrieval and Machine Translation (ESIRMT) and Hybrid Approaches to Machine Translation (HyTra)

An Empirical Evaluation of Stop Word Removal in Statistical Machine Translation
Tze Yuang Chong | Rafael Banchs | Eng Siong Chng
Proceedings of the Joint Workshop on Exploiting Synergies between Information Retrieval and Machine Translation (ESIRMT) and Hybrid Approaches to Machine Translation (HyTra)

Proceedings of the ACL-2012 Special Workshop on Rediscovering 50 Years of Discoveries
Rafael E. Banchs
Proceedings of the ACL-2012 Special Workshop on Rediscovering 50 Years of Discoveries

Movie-DiC: a Movie Dialogue Corpus for Research and Development
Rafael E. Banchs
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

IRIS: a Chat-oriented Dialogue System based on the Vector Space Model
Rafael E. Banchs | Haizhou Li
Proceedings of the ACL 2012 System Demonstrations

2011

Deriving translation units using small additional corpora
Carlos A. Henríquez Q. | José B. Mariño | Rafael E. Banchs
Proceedings of the 15th Annual Conference of the European Association for Machine Translation


AM-FM: A Semantic Framework for Translation Quality Assessment
Rafael E. Banchs | Haizhou Li
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

Enhancing scarce-resource language translation through pivot combinations
Marta R. Costa-jussà | Carlos Henríquez | Rafael E. Banchs
Proceedings of 5th International Joint Conference on Natural Language Processing

A Semantic Feature for Statistical Machine Translation
Rafael E. Banchs | Marta R. Costa-jussà
Proceedings of Fifth Workshop on Syntax, Semantics and Structure in Statistical Translation

The BM-I2R Haitian-Créole-to-English translation system description for the WMT 2011 evaluation campaign
Marta R. Costa-jussà | Rafael E. Banchs
Proceedings of the Sixth Workshop on Statistical Machine Translation

Comparative Evaluation of Spanish Segmentation Strategies for Spanish-Chinese Transliteration
Rafael E. Banchs
Proceedings of the 3rd Named Entities Workshop (NEWS 2011)

2010

Opinion Mining of Spanish Customer Comments with Non-Expert Annotations on Mechanical Turk
Bart Mellebeek | Francesc Benavent | Jens Grivolla | Joan Codina | Marta R. Costa-jussà | Rafael Banchs
Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon’s Mechanical Turk

Using Collocation Segmentation to Augment the Phrase Table
Carlos A. Henríquez Q. | Marta Ruiz Costa-jussà | Vidas Daudaravicius | Rafael E. Banchs | José B. Mariño
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR

Integration of statistical collocation segmentations in a phrase-based statistical machine translation system
Marta R. Costa-jussa | Vidas Daudaravicius | Rafael E. Banchs
Proceedings of the 14th Annual Conference of the European Association for Machine Translation

I2R’s machine translation system for IWSLT 2010
Xiangyu Duan | Rafael Banchs | Jun Lang | Deyi Xiong | Aiti Aw | Min Zhang | Haizhou Li
Proceedings of the 7th International Workshop on Spoken Language Translation: Evaluation Campaign

UPC-BMIC-VDU system description for the IWSLT 2010: testing several collocation segmentations in a phrase-based SMT system
Carlos Henríquez | Marta R. Costa-jussà | Vidas Daudaravicius | Rafael E. Banchs | José B. Mariño
Proceedings of the 7th International Workshop on Spoken Language Translation: Evaluation Campaign

This paper describes the UPC-BMIC-VMU participation in the IWSLT 2010 evaluation campaign. The SMT system is a standard phrase-based system enriched with novel segmentations. These novel segmentations are computed using statistical measures such as Log-likelihood, T-score, Chi-squared, Dice, Mutual Information or Gravity-Counts. The analysis of translation results allows the measures to be divided into three groups. First, Log-likelihood, Chi-squared and T-score tend to combine high-frequency words, and the resulting collocation segments are very short. They improve the SMT system by adding new translation units. Second, Mutual Information and Dice tend to combine low-frequency words, and the resulting collocation segments are short. They improve the SMT system by smoothing the translation units. Third, Gravity-Counts tends to combine high- and low-frequency words, and the resulting collocation segments are long. However, in this case, the SMT system is not improved. Thus, the road-map for translation system improvement is to introduce new phrases with either low-frequency or high-frequency words. It is hard to improve translation quality by introducing new phrases that mix low- and high-frequency words. Experimental results are reported for the French-to-English IWSLT 2010 evaluation, where our system was ranked 3rd out of nine systems.

2009

Barcelona Media SMT system description for the IWSLT 2009
Marta R. Costa-jussà | Rafael E. Banchs
Proceedings of the 6th International Workshop on Spoken Language Translation: Evaluation Campaign

This paper describes the Barcelona Media SMT system in the IWSLT 2009 evaluation campaign. The Barcelona Media system is a statistical phrase-based system enriched with source context information. Adding source context to an SMT system can enhance translation by helping to resolve lexical and structural choice errors. The novel technique uses a similarity metric between each test sentence and each training sentence. First experimental results of this technique are reported for the Arabic and Chinese Basic Traveling Expression Corpus (BTEC) tasks. Even within a single domain, there are ambiguities in SMT translation units, and slight improvements in BLEU are shown in both tasks (Zh2En and Ar2En).

The TALP-UPC Phrase-Based Translation System for EACL-WMT 2009
José A. R. Fonollosa | Maxim Khalilov | Marta R. Costa-jussà | José B. Mariño | Carlos A. Henríquez Q. | Adolfo Hernández H. | Rafael E. Banchs
Proceedings of the Fourth Workshop on Statistical Machine Translation

2008

The TALP&I2R SMT systems for IWSLT 2008.
Maxim Khalilov | Maria R. Costa-jussà | Carlos A. Henríquez Q. | José A. R. Fonollosa | Adolfo Hernández H. | José B. Mariño | Rafael E. Banchs | Chen Boxing | Min Zhang | Aiti Aw | Haizhou Li
Proceedings of the 5th International Workshop on Spoken Language Translation: Evaluation Campaign

This paper gives a description of the statistical machine translation (SMT) systems developed at the TALP Research Center of the UPC (Universitat Politècnica de Catalunya) for our participation in the IWSLT’08 evaluation campaign. We present Ngram-based (TALPtuples) and phrase-based (TALPphrases) SMT systems. The paper explains the 2008 systems’ architecture and outlines the translation schemes we have used, mainly focusing on the new techniques intended to improve speech-to-speech translation quality. The novelties we have introduced are: an improved reordering method, a linear combination of translation and reordering models, and a new technique for inserting punctuation marks in a phrase-based SMT system. This year we focus on the Arabic-English, Chinese-Spanish and pivot Chinese-(English)-Spanish translation tasks.

The TALP-UPC Ngram-Based Statistical Machine Translation System for ACL-WMT 2008
Maxim Khalilov | Adolfo Hernández H. | Marta R. Costa-jussà | Josep M. Crego | Carlos A. Henríquez Q. | Patrik Lambert | José A. R. Fonollosa | José B. Mariño | Rafael E. Banchs
Proceedings of the Third Workshop on Statistical Machine Translation

Word association models and search strategies for discriminative word alignment
Patrik Lambert | Rafael E. Banchs
Proceedings of the 12th Annual Conference of the European Association for Machine Translation

2007

Human Evaluation of Machine Translation Through Binary System Comparisons
David Vilar | Gregor Leusch | Hermann Ney | Rafael E. Banchs
Proceedings of the Second Workshop on Statistical Machine Translation

Ngram-Based Statistical Machine Translation Enhanced with Multiple Weighted Reordering Hypotheses
Marta R. Costa-jussà | Josep M. Crego | Patrik Lambert | Maxim Khalilov | José A. R. Fonollosa | José B. Mariño | Rafael E. Banchs
Proceedings of the Second Workshop on Statistical Machine Translation

Discriminative Alignment Training without Annotated Data for Machine Translation
Patrik Lambert | Rafael E. Banchs | Josep M. Crego
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers

The TALP ngram-based SMT system for IWSLT 2007
Patrik Lambert | Marta R. Costa-jussà | Josep M. Crego | Maxim Khalilov | José B. Mariño | Rafael E. Banchs | José A. R. Fonollosa | Holger Schwenk
Proceedings of the Fourth International Workshop on Spoken Language Translation

This paper describes TALPtuples, the 2007 N-gram-based statistical machine translation system developed at the TALP Research Center of the UPC (Universitat Politècnica de Catalunya) in Barcelona. Emphasis is put on improvements and extensions of the system of previous years. Mainly, these include optimizing alignment parameters as a function of translation metric scores and rescoring with a neural network language model. Results on two translation directions are reported, namely from Arabic and Chinese into English, thoroughly explaining all language-related preprocessing and translation schemes.

2006

Grouping Multi-word Expressions According to Part-Of-Speech in Statistical Machine Translation
Patrik Lambert | Rafael Banchs
Proceedings of the Workshop on Multi-word-expressions in a multilingual context

Morpho-syntactic Information for Automatic Error Analysis of Statistical Machine Translation Output
Maja Popović | Adrià de Gispert | Deepa Gupta | Patrik Lambert | Hermann Ney | José B. Mariño | Marcello Federico | Rafael Banchs
Proceedings on the Workshop on Statistical Machine Translation

TALP Phrase-based statistical translation system for European language pairs
Marta R. Costa-jussà | Josep M. Crego | Adrià de Gispert | Patrik Lambert | Maxim Khalilov | José B. Mariño | José A. R. Fonollosa | Rafael Banchs
Proceedings on the Workshop on Statistical Machine Translation

N-gram-based SMT System Enhanced with Reordering Patterns
Josep M. Crego | Adrià de Gispert | Patrik Lambert | Marta R. Costa-jussà | Maxim Khalilov | Rafael Banchs | José B. Mariño | José A. R. Fonollosa
Proceedings on the Workshop on Statistical Machine Translation

The TALP Ngram-based SMT systems for IWSLT 2006
Josep M. Crego | Adrià de Gispert | Patrick Lambert | Maxim Khalilov | Marta R. Costa-jussà | José B. Mariño | Rafael Banchs | José A. R. Fonollosa
Proceedings of the Third International Workshop on Spoken Language Translation: Evaluation Campaign

TALP phrase-based system and TALP system combination for IWSLT 2006
Marta R. Costa-jussà | Josep M. Crego | Adrià de Gispert | Patrik Lambert | Maxim Khalilov | José A. R. Fonollosa | José B. Mariño | Rafael Banchs
Proceedings of the Third International Workshop on Spoken Language Translation: Evaluation Campaign

Tuning machine translation parameters with SPSA
Patrik Lambert | Rafael E. Banchs
Proceedings of the Third International Workshop on Spoken Language Translation: Papers

N-gram-based Machine Translation
José Mariño | Rafael E. Banchs | Josep M. Crego | Adrià de Gispert | Patrik Lambert | José A. R. Fonollosa | Marta R. Costa-jussà
Computational Linguistics, Volume 32, Number 4, December 2006

Acceptance Testing of a Spoken Language Translation System
Rafael Banchs | Antonio Bonafonte | Javier Pérez
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

This paper describes an acceptance test procedure for evaluating a spoken language translation system between Catalan and Spanish. The procedure consists of two independent tests. The first test was an utterance-oriented evaluation for determining how the use of speech benefits communication. This test allowed for comparing the relative performance of the different system components, specifically: source text to target text, source text to target speech, source speech to target text, and source speech to target speech. The second test was a task-oriented experiment for evaluating whether users could achieve some predefined goals for a given task with the current state of the technology. Eight subjects familiar with the technology and four subjects not familiar with the technology participated in the tests. From the results we can conclude that the state of the technology is getting closer to providing effective speech-to-speech translation systems, but there is still a lot of work to be done in this area. No significant differences in performance were observed between users who are familiar with the technology and users who are not. This constitutes, as far as we know, the first evaluation of a Spoken Translation System that considers performance at both the utterance level and the task level.

2005

Statistical Machine Translation of Euparl Data by using Bilingual N-grams
Rafael E. Banchs | Josep M. Crego | Adrià de Gispert | Patrik Lambert | José B. Mariño
Proceedings of the ACL Workshop on Building and Using Parallel Texts

Bilingual N-gram Statistical Machine Translation
José B. Mariño | Rafael E. Banchs | Josep M. Crego | Adrià de Gispert | Patrik Lambert | José A. R. Fonollosa | Marta Ruiz
Proceedings of Machine Translation Summit X: Papers

This paper describes a statistical machine translation system that uses a translation model based on bilingual n-grams. When this translation model is log-linearly combined with four specific feature functions, state-of-the-art translations are achieved for Spanish-to-English and English-to-Spanish translation tasks. Some specific results obtained for the EPPS (European Parliament Plenary Sessions) data are presented and discussed. Finally, future research issues are outlined.

Data Inferred Multi-word Expressions for Statistical Machine Translation
Patrick Lambert | Rafael Banchs
Proceedings of Machine Translation Summit X: Posters

This paper presents a strategy for detecting and using multi-word expressions in Statistical Machine Translation. Performance of the proposed strategy is evaluated in terms of alignment quality as well as translation accuracy. Evaluations are performed by using the Verbmobil corpus. Results from translation tasks from English-to-Spanish and from Spanish-to-English are presented and discussed.