Patrik Lambert


2019

Improving Robustness in Real-World Neural Machine Translation Engines
Rohit Gupta | Patrik Lambert | Raj Patel | John Tinsley
Proceedings of Machine Translation Summit XVII: Translator, Project and User Tracks

Attention and Lexicon Regularized LSTM for Aspect-based Sentiment Analysis
Lingxian Bao | Patrik Lambert | Toni Badia
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop

Attention-based deep learning systems have been demonstrated to be the state-of-the-art approach for aspect-level sentiment analysis. However, end-to-end deep neural networks lack flexibility, as one cannot easily adjust the network to fix an obvious problem, especially when more training data is not available: for example, when it always predicts positive on seeing the word “disappointed”. Meanwhile, it is less often stressed that the attention mechanism is likely to “over-focus” on particular parts of a sentence, while ignoring positions which provide key information for judging the polarity. In this paper, we describe a simple yet effective approach to leverage lexicon information so that the model becomes more flexible and robust. We also explore the effect of regularizing attention vectors to allow the network to have a broader “focus” on different parts of the sentence. The experimental results demonstrate the effectiveness of our approach.

2018

MultiBooked: A Corpus of Basque and Catalan Hotel Reviews Annotated for Aspect-level Sentiment Classification
Jeremy Barnes | Toni Badia | Patrik Lambert
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2016

Exploring Distributional Representations and Machine Translation for Aspect-based Cross-lingual Sentiment Classification
Jeremy Barnes | Patrik Lambert | Toni Badia
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Cross-lingual sentiment classification (CLSC) seeks to use resources from a source language in order to detect sentiment and classify text in a target language. Almost all research into CLSC has been carried out at sentence and document level, although this level of granularity is often less useful. This paper explores methods for performing aspect-based cross-lingual sentiment classification (aspect-based CLSC) for under-resourced languages. Given the limited nature of parallel data for many languages, we would like to make the most of this resource for our task. We compare zero-shot learning, bilingual word embeddings, stacked denoising autoencoder representations and machine translation techniques for aspect-based CLSC. Each of these approaches requires differing amounts of parallel data. We show that models based on distributed semantics can achieve comparable results to machine translation on aspect-based CLSC and give an analysis of the errors found for each method.

Proceedings of the Sixth Workshop on Hybrid Approaches to Translation (HyTra6)
Patrik Lambert | Bogdan Babych | Kurt Eberle | Rafael E. Banchs | Reinhard Rapp | Marta R. Costa-jussà
Proceedings of the Sixth Workshop on Hybrid Approaches to Translation (HyTra6)

2015

Aspect-Level Cross-lingual Sentiment Classification with Constrained SMT
Patrik Lambert
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Proceedings of the Fourth Workshop on Hybrid Approaches to Translation (HyTra)
Bogdan Babych | Kurt Eberle | Patrik Lambert | Reinhard Rapp | Rafael E. Banchs | Marta R. Costa-jussà
Proceedings of the Fourth Workshop on Hybrid Approaches to Translation (HyTra)

2014

Adapting Freely Available Resources to Build an Opinion Mining Pipeline in Portuguese
Patrik Lambert | Carlos Rodríguez-Penagos
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

We present a complete UIMA-based pipeline for sentiment analysis in Portuguese news using freely available resources and a minimal set of manually annotated training data. We obtained good precision on binary classification but concluded that news feeds are a challenging environment in which to detect the extent of opinionated text.

Proceedings of the 3rd Workshop on Hybrid Approaches to Machine Translation (HyTra)
Rafael E. Banchs | Marta R. Costa-jussà | Reinhard Rapp | Patrik Lambert | Kurt Eberle | Bogdan Babych
Proceedings of the 3rd Workshop on Hybrid Approaches to Machine Translation (HyTra)

2013

Proceedings of the Second Workshop on Hybrid Approaches to Translation
Marta Ruiz Costa-jussà | Reinhard Rapp | Patrik Lambert | Kurt Eberle | Rafael E. Banchs | Bogdan Babych
Proceedings of the Second Workshop on Hybrid Approaches to Translation

Workshop on Hybrid Approaches to Translation: Overview and Developments
Marta R. Costa-jussà | Rafael Banchs | Reinhard Rapp | Patrik Lambert | Kurt Eberle | Bogdan Babych
Proceedings of the Second Workshop on Hybrid Approaches to Translation

FBM: Combining lexicon-based ML and heuristics for Social Media Polarities
Carlos Rodríguez-Penagos | Jordi Atserias Batalla | Joan Codina-Filbà | David García-Narbona | Jens Grivolla | Patrik Lambert | Roser Saurí
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)

2012

Collaborative Machine Translation Service for Scientific texts
Patrik Lambert | Jean Senellart | Laurent Romary | Holger Schwenk | Florian Zipser | Patrice Lopez | Frédéric Blain
Proceedings of the Demonstrations at the 13th Conference of the European Chapter of the Association for Computational Linguistics

Proceedings of the Joint Workshop on Exploiting Synergies between Information Retrieval and Machine Translation (ESIRMT) and Hybrid Approaches to Machine Translation (HyTra)
Marta R. Costa-jussà | Patrik Lambert | Rafael E. Banchs | Reinhard Rapp | Bogdan Babych
Proceedings of the Joint Workshop on Exploiting Synergies between Information Retrieval and Machine Translation (ESIRMT) and Hybrid Approaches to Machine Translation (HyTra)

LIUM’s SMT Machine Translation Systems for WMT 2012
Christophe Servan | Patrik Lambert | Anthony Rousseau | Holger Schwenk | Loïc Barrault
Proceedings of the Seventh Workshop on Statistical Machine Translation

Automatic Translation of Scientific Documents in the HAL Archive
Patrik Lambert | Holger Schwenk | Frédéric Blain
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

This paper describes the development of a statistical machine translation system between French and English for scientific papers. This system will be closely integrated into the French HAL open archive, a collection of more than 100,000 scientific papers. We describe the creation of in-domain parallel and monolingual corpora, the development of a domain-specific translation system with the created resources, and its adaptation using monolingual resources only. These techniques allowed us to improve a generic system by more than 10 BLEU points.

2011

Investigations on Translation Model Adaptation Using Monolingual Data
Patrik Lambert | Holger Schwenk | Christophe Servan | Sadaf Abdul-Rauf
Proceedings of the Sixth Workshop on Statistical Machine Translation

LIUM’s SMT Machine Translation Systems for WMT 2011
Holger Schwenk | Patrik Lambert | Loïc Barrault | Christophe Servan | Sadaf Abdul-Rauf | Haithem Afli | Kashif Shah
Proceedings of the Sixth Workshop on Statistical Machine Translation

2010

Statistical Analysis of Alignment Characteristics for Phrase-based Machine Translation
Patrik Lambert | Simon Petitrenaud | Yanjun Ma | Andy Way
Proceedings of the 14th Annual Conference of the European Association for Machine Translation

LIUM SMT Machine Translation System for WMT 2010
Patrik Lambert | Sadaf Abdul-Rauf | Holger Schwenk
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR

2009

LIUM’s statistical machine translation system for IWSLT 2009
Holger Schwenk | Loïc Barrault | Yannick Estève | Patrik Lambert
Proceedings of the 6th International Workshop on Spoken Language Translation: Evaluation Campaign

This paper describes the systems developed by the LIUM laboratory for the 2009 IWSLT evaluation. We participated in the Arabic and Chinese to English BTEC tasks. We developed three different systems: a statistical phrase-based system using the Moses toolkit, a Statistical Post-Editing system and a hierarchical phrase-based system based on Joshua. A continuous space language model was deployed to improve the modeling of the target language. These systems are combined by a confusion-network-based approach.

Tracking Relevant Alignment Characteristics for Machine Translation
Patrik Lambert | Yanjun Ma | Sylwia Ozdowska | Andy Way
Proceedings of Machine Translation Summit XII: Posters

Tuning Syntactically Enhanced Word Alignment for Statistical Machine Translation
Yanjun Ma | Patrik Lambert | Andy Way
Proceedings of the 13th Annual Conference of the European Association for Machine Translation

2008

Word association models and search strategies for discriminative word alignment
Patrik Lambert | Rafael E. Banchs
Proceedings of the 12th Annual Conference of the European Association for Machine Translation

The TALP-UPC Ngram-Based Statistical Machine Translation System for ACL-WMT 2008
Maxim Khalilov | Adolfo Hernández H. | Marta R. Costa-jussà | Josep M. Crego | Carlos A. Henríquez Q. | Patrik Lambert | José A. R. Fonollosa | José B. Mariño | Rafael E. Banchs
Proceedings of the Third Workshop on Statistical Machine Translation

2007

Ngram-Based Statistical Machine Translation Enhanced with Multiple Weighted Reordering Hypotheses
Marta R. Costa-jussà | Josep M. Crego | Patrik Lambert | Maxim Khalilov | José A. R. Fonollosa | José B. Mariño | Rafael E. Banchs
Proceedings of the Second Workshop on Statistical Machine Translation

The TALP ngram-based SMT system for IWSLT 2007
Patrik Lambert | Marta R. Costa-jussà | Josep M. Crego | Maxim Khalilov | José B. Mariño | Rafael E. Banchs | José A. R. Fonollosa | Holger Schwenk
Proceedings of the Fourth International Workshop on Spoken Language Translation

This paper describes TALPtuples, the 2007 N-gram-based statistical machine translation system developed at the TALP Research Center of the UPC (Universitat Politècnica de Catalunya) in Barcelona. Emphasis is put on improvements and extensions of the system of previous years. Mainly, these include optimizing alignment parameters as a function of translation metric scores and rescoring with a neural network language model. Results on two translation directions are reported, namely from Arabic and Chinese into English, thoroughly explaining all language-related preprocessing and translation schemes.

Discriminative Alignment Training without Annotated Data for Machine Translation
Patrik Lambert | Rafael E. Banchs | Josep M. Crego
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers

2006

TALP phrase-based system and TALP system combination for IWSLT 2006
Marta R. Costa-jussà | Josep M. Crego | Adrià de Gispert | Patrik Lambert | Maxim Khalilov | José A. R. Fonollosa | José B. Mariño | Rafael Banchs
Proceedings of the Third International Workshop on Spoken Language Translation: Evaluation Campaign

Tuning machine translation parameters with SPSA
Patrik Lambert | Rafael E. Banchs
Proceedings of the Third International Workshop on Spoken Language Translation: Papers

N-gram-based Machine Translation
José Mariño | Rafael E. Banchs | Josep M. Crego | Adrià de Gispert | Patrik Lambert | José A. R. Fonollosa | Marta R. Costa-jussà
Computational Linguistics, Volume 32, Number 4, December 2006

Grouping Multi-word Expressions According to Part-Of-Speech in Statistical Machine Translation
Patrik Lambert | Rafael Banchs
Proceedings of the Workshop on Multi-word-expressions in a multilingual context

Morpho-syntactic Information for Automatic Error Analysis of Statistical Machine Translation Output
Maja Popović | Adrià de Gispert | Deepa Gupta | Patrik Lambert | Hermann Ney | José B. Mariño | Marcello Federico | Rafael Banchs
Proceedings on the Workshop on Statistical Machine Translation

TALP Phrase-based statistical translation system for European language pairs
Marta R. Costa-jussà | Josep M. Crego | Adrià de Gispert | Patrik Lambert | Maxim Khalilov | José B. Mariño | José A. R. Fonollosa | Rafael Banchs
Proceedings on the Workshop on Statistical Machine Translation

N-gram-based SMT System Enhanced with Reordering Patterns
Josep M. Crego | Adrià de Gispert | Patrik Lambert | Marta R. Costa-jussà | Maxim Khalilov | Rafael Banchs | José B. Mariño | José A. R. Fonollosa
Proceedings on the Workshop on Statistical Machine Translation

2005

Statistical Machine Translation of Euparl Data by using Bilingual N-grams
Rafael E. Banchs | Josep M. Crego | Adrià de Gispert | Patrik Lambert | José B. Mariño
Proceedings of the ACL Workshop on Building and Using Parallel Texts

Bilingual N-gram Statistical Machine Translation
José B. Mariño | Rafael E. Banchs | Josep M. Crego | Adrià de Gispert | Patrik Lambert | José A. R. Fonollosa | Marta Ruiz
Proceedings of Machine Translation Summit X: Papers

This paper describes a statistical machine translation system that uses a translation model based on bilingual n-grams. When this translation model is log-linearly combined with four specific feature functions, state-of-the-art translations are achieved for Spanish-to-English and English-to-Spanish translation tasks. Some specific results obtained for the EPPS (European Parliament Plenary Sessions) data are presented and discussed. Finally, future research issues are outlined.

2004

Bilingual Connections for Trilingual Corpora: An XML Approach
Victoria Arranz | Núria Castell | Josep Maria Crego | Jesús Giménez | Adrià de Gispert | Patrik Lambert
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)