2020
Prédiction continue de la satisfaction et de la frustration dans des conversations de centre d’appels (AlloSat : A New Call Center French Corpus for Affect Analysis)
Manon Macary | Marie Tahon | Yannick Estève | Anthony Rousseau
Actes de la 6e conférence conjointe Journées d'Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Volume 1 : Journées d'Études sur la Parole
We present a new corpus, named AlloSat, composed of French call center conversations continuously annotated for frustration and satisfaction. In the call center context, a conversation generally aims to resolve the caller's request. This corpus was built to develop new systems able to model the continuous aspect of semantic and paralinguistic information at the conversation level. We focus on the paralinguistic level, more precisely on the expression of emotions. To our knowledge, most emotional corpora contain annotations in discrete categories or in continuous dimensions such as activation or valence. We hypothesize that these dimensions are not sufficiently related to our context. To address this problem, we propose a corpus that enables real-time knowledge of the frustration/satisfaction axis. AlloSat comprises 303 conversations for a total of approximately 37 hours of audio, all recorded in real-life environments and collected by Allo-Media (a company specializing in automatic call analysis). Initial classification experiments show that the evolution of the frustration/satisfaction axis can be predicted automatically for each conversation.
AlloSat: A New Call Center French Corpus for Satisfaction and Frustration Analysis
Manon Macary | Marie Tahon | Yannick Estève | Anthony Rousseau
Proceedings of the Twelfth Language Resources and Evaluation Conference
We present a new corpus, named AlloSat, composed of real-life call center conversations in French and continuously annotated for frustration and satisfaction. This corpus has been set up to develop new systems able to model the continuous aspect of semantic and paralinguistic information at the conversation level. The present work focuses on the paralinguistic level, more precisely on the expression of emotions. In the call center industry, a conversation usually aims at solving the caller’s request. As far as we know, most emotional databases contain static annotations in discrete categories or in dimensions such as activation or valence. We hypothesize that these dimensions are not task-related enough. Moreover, static annotations do not make it possible to explore the temporal evolution of emotional states. To solve this issue, we propose a corpus with a rich annotation scheme enabling a real-time investigation of the frustration/satisfaction axis. AlloSat comprises 303 conversations with a total of approximately 37 hours of audio, all recorded in real-life environments and collected by Allo-Media (an intelligent call tracking company). First regression experiments, with audio features, show that the evolution of the frustration/satisfaction axis can be retrieved automatically at the conversation level.
2015
The LIUM ASR and SLT systems for IWSLT 2015
Mercedes Garcia Martínez | Loïc Barrault | Anthony Rousseau | Paul Deléglise | Yannick Estève
Proceedings of the 12th International Workshop on Spoken Language Translation: Evaluation Campaign
2014
LIUM English-to-French spoken language translation system and the Vecsys/LIUM automatic speech recognition system for Italian language for IWSLT 2014
Anthony Rousseau | Loïc Barrault | Paul Deléglise | Yannick Estève | Holger Schwenk | Samir Bennacef | Armando Muscariello | Stephan Vanni
Proceedings of the 11th International Workshop on Spoken Language Translation: Evaluation Campaign
This paper describes the Spoken Language Translation system developed by the LIUM for the IWSLT 2014 evaluation campaign. We participated in two of the proposed tasks: (i) the Automatic Speech Recognition task (ASR) in two languages, Italian with the Vecsys company, and English alone, (ii) the English to French Spoken Language Translation task (SLT). We present the approaches and specificities found in our systems, as well as the results from the evaluation campaign.
Enhancing the TED-LIUM Corpus with Selected Data for Language Modeling and More TED Talks
Anthony Rousseau | Paul Deléglise | Yannick Estève
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
In this paper, we present improvements made to the TED-LIUM corpus we released in 2012. These enhancements fall into two categories. First, we describe how we filtered publicly available monolingual data and used it to estimate well-suited language models (LMs), using open-source tools. Then, we describe the process of selection we applied to new acoustic data from TED talks, providing additions to our previously released corpus. Finally, we report some experiments we made around these improvements.
2012
TED-LIUM: an Automatic Speech Recognition dedicated corpus
Anthony Rousseau | Paul Deléglise | Yannick Estève
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
This paper presents the corpus developed by the LIUM for Automatic Speech Recognition (ASR), based on the TED Talks. This corpus was built during the IWSLT 2011 Evaluation Campaign, and is composed of 118 hours of speech with its accompanying automatically aligned transcripts. We describe the content of the corpus, how the data was collected and processed, how it will be made publicly available, and how we built an ASR system using this data, leading to a WER score of 17.4%. The official results we obtained at the IWSLT 2011 evaluation campaign are also discussed.
Large, Pruned or Continuous Space Language Models on a GPU for Statistical Machine Translation
Holger Schwenk | Anthony Rousseau | Mohammed Attik
Proceedings of the NAACL-HLT 2012 Workshop: Will We Ever Really Replace the N-gram Model? On the Future of Language Modeling for HLT
LIUM’s SMT Machine Translation Systems for WMT 2012
Christophe Servan | Patrik Lambert | Anthony Rousseau | Holger Schwenk | Loïc Barrault
Proceedings of the Seventh Workshop on Statistical Machine Translation
2011
LIUM’s systems for the IWSLT 2011 speech translation tasks
Anthony Rousseau | Fethi Bougares | Paul Deléglise | Holger Schwenk | Yannick Estève
Proceedings of the 8th International Workshop on Spoken Language Translation: Evaluation Campaign
This paper describes the three systems developed by the LIUM for the IWSLT 2011 evaluation campaign. We participated in three of the proposed tasks, namely the Automatic Speech Recognition task (ASR), the ASR system combination task (ASR_SC) and the Spoken Language Translation task (SLT), since these tasks are all related to speech translation. We present the approaches and specificities we developed on each task.
2010
LIUM’s statistical machine translation system for IWSLT 2010
Anthony Rousseau | Loïc Barrault | Paul Deléglise | Yannick Estève
Proceedings of the 7th International Workshop on Spoken Language Translation: Evaluation Campaign
This paper describes the two systems developed by the LIUM laboratory for the 2010 IWSLT evaluation campaign. We participated in the new English to French TALK task. We developed two systems, one for each evaluation condition, both being statistical phrase-based systems using the Moses toolkit. Several approaches were investigated.