Patrik Lambert


2019

Attention based deep learning systems have been demonstrated to be the state of the art approach for aspect-level sentiment analysis, however, end-to-end deep neural networks lack flexibility as one can not easily adjust the network to fix an obvious problem, especially when more training data is not available: e.g. when it always predicts positive when seeing the word disappointed. Meanwhile, it is less stressed that attention mechanism is likely to “over-focus” on particular parts of a sentence, while ignoring positions which provide key information for judging the polarity. In this paper, we describe a simple yet effective approach to leverage lexicon information so that the model becomes more flexible and robust. We also explore the effect of regularizing attention vectors to allow the network to have a broader “focus” on different parts of the sentence. The experimental results demonstrate the effectiveness of our approach.

2018

2016

Cross-lingual sentiment classification (CLSC) seeks to use resources from a source language in order to detect sentiment and classify text in a target language. Almost all research into CLSC has been carried out at sentence and document level, although this level of granularity is often less useful. This paper explores methods for performing aspect-based cross-lingual sentiment classification (aspect-based CLSC) for under-resourced languages. Given the limited nature of parallel data for many languages, we would like to make the most of this resource for our task. We compare zero-shot learning, bilingual word embeddings, stacked denoising autoencoder representations and machine translation techniques for aspect-based CLSC. Each of these approaches requires differing amounts of parallel data. We show that models based on distributed semantics can achieve comparable results to machine translation on aspect-based CLSC and give an analysis of the errors found for each method.

2015

2014

We present a complete UIMA-based pipeline for sentiment analysis in Portuguese news using freely available resources and a minimal set of manually annotated training data. We obtained good precision on binary classification but concluded that news feed is a challenging environment to detect the extent of opinionated text.

2013

2012

This paper describes the development of a statistical machine translation system between French and English for scientific papers. This system will be closely integrated into the French HAL open archive, a collection of more than 100.000 scientific papers. We describe the creation of in-domain parallel and monolingual corpora, the development of a domain specific translation system with the created resources, and its adaptation using monolingual resources only. These techniques allowed us to improve a generic system by more than 10 BLEU points.

2011

2010

2009

This paper describes the systems developed by the LIUM laboratory for the 2009 IWSLT evaluation. We participated in the Arabic and Chinese to English BTEC tasks. We developed three different systems: a statistical phrase-based system using the Moses toolkit, an Statistical Post-Editing system and a hierarchical phrase-based system based on Joshua. A continuous space language model was deployed to improve the modeling of the target language. These systems are combined by a confusion network based approach.

2008

2007

This paper describes TALPtuples, the 2007 N-gram-based statistical machine translation system developed at the TALP Research Center of the UPC (Universitat Polite`cnica de Catalunya) in Barcelona. Emphasis is put on improvements and extensions of the system of previous years. Mainly, these include optimizing alignment parameters in function of translation metric scores and rescoring with a neural network language model. Results on two translation directions are reported, namely from Arabic and Chinese into English, thoroughly explaining all language-related preprocessing and translation schemes.

2006

2005

This paper describes a statistical machine translation system that uses a translation model which is based on bilingual n-grams. When this translation model is log-linearly combined with four specific feature functions, state of the art translations are achieved for Spanish-to-English and English-to-Spanish translation tasks. Some specific results obtained for the EPPS (European Parliament Plenary Sessions) data are presented and discussed. Finally, future research issues are depicted.

2004