Benoit Favre

Also published as: Benoît Favre


“Do you follow me?”: A Survey of Recent Approaches in Dialogue State Tracking
Léo Jacqmin | Lina M. Rojas Barahona | Benoit Favre
Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue

While communicating with a user, a task-oriented dialogue system has to track the user’s needs at each turn according to the conversation history. This process, called dialogue state tracking (DST), is crucial because it directly informs the downstream dialogue policy. DST has received a lot of interest in recent years, with the text-to-text paradigm emerging as the favored approach. In this review paper, we first present the task and its associated datasets. Then, considering a large number of recent publications, we identify highlights and advances of research in 2021-2022. Although neural approaches have enabled significant progress, we argue that some critical aspects of dialogue systems, such as generalizability, are still underexplored. To motivate future studies, we propose several research avenues.

Zero-Shot Aspect-Based Scientific Document Summarization using Self-Supervised Pre-training
Amir Soleimani | Vassilina Nikoulina | Benoit Favre | Salah Ait Mokhtar
Proceedings of the 21st Workshop on Biomedical Language Processing

We study the zero-shot setting for the aspect-based scientific document summarization task. Summarizing scientific documents with respect to an aspect can remarkably improve document assistance systems and readers’ experience. However, existing large-scale datasets contain a limited variety of aspects, causing summarization models to over-fit to a small set of aspects and a specific domain. We establish baseline results in zero-shot performance (over unseen aspects and in the presence of domain shift), paraphrasing, leave-one-out, and limited supervised samples experimental setups. We propose a self-supervised pre-training approach to enhance the zero-shot performance. We leverage the PubMed structured abstracts to create a biomedical aspect-based summarization dataset. Experimental results on the PubMed and FacetSum aspect-based datasets show promising performance when the model is pre-trained using unlabelled in-domain data.

Abstraction ou hallucination ? État des lieux et évaluation du risque pour les modèles de génération de résumés automatiques de type séquence-à-séquence (Abstraction or Hallucination ? Status and Risk assessment for sequence-to-sequence Automatic)
Eunice Akani | Benoit Favre | Frederic Bechet
Actes de la 29e Conférence sur le Traitement Automatique des Langues Naturelles. Volume 1 : conférence principale

Text generation has recently attracted strong interest given the notable advances in neural language models. Despite these advances, the task remains difficult when it comes to abstractive automatic summarization. Some summarization systems generate texts that are not necessarily faithful to the source document. Our study addresses this issue. We present a typology of errors for automatic summaries, as well as a characterization of the abstraction phenomenon in reference summaries, in order to better understand the extent of these phenomena on named entities. We also propose a measure for evaluating the risk of error when a system attempts to make abstractions over the named entities of a document.

Simulation d’erreurs d’OCR dans les systèmes de TAL pour le traitement de données anachroniques (Simulation of OCR errors in NLP systems for processing anachronistic data)
Baptiste Blouin | Benoit Favre | Jeremy Auguste
Actes de la 29e Conférence sur le Traitement Automatique des Langues Naturelles. Atelier TAL et Humanités Numériques (TAL-HN)

Information extraction opens new perspectives for historical research. However, most research in this area is carried out on contemporary data. Despite the constant improvement of OCR systems, the historical texts they produce still contain many errors. Because of a lack of historical resources dedicated to NLP, processing in this domain remains dependent on contemporary resources. Many studies have demonstrated the negative impact that OCR errors can have on contemporary off-the-shelf systems. However, the evaluation of newer architectures, which offer promising results on recent data, against this problem remains very limited. In this study, we quantify the impact of OCR errors on three information extraction tasks using several Transformer-type architectures. In light of these results, we propose an approach that reduces this impact by more than 50% without resorting to specialized historical resources.

Do Vision-and-Language Transformers Learn Grounded Predicate-Noun Dependencies?
Mitja Nikolaus | Emmanuelle Salin | Stephane Ayache | Abdellah Fourtassi | Benoit Favre
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Recent advances in vision-and-language modeling have seen the development of Transformer architectures that achieve remarkable performance on multimodal reasoning tasks. Yet, the exact capabilities of these black-box models are still poorly understood. While much of previous work has focused on studying their ability to learn meaning at the word-level, their ability to track syntactic dependencies between words has received less attention. We take a first step in closing this gap by creating a new multimodal task targeted at evaluating understanding of predicate-noun dependencies in a controlled setup. We evaluate a range of state-of-the-art models and find that their performance on the task varies considerably, with some models performing relatively well and others at chance level. In an effort to explain this variability, our analyses indicate that the quality (and not only sheer quantity) of pretraining data is essential. Additionally, the best performing models leverage fine-grained multimodal pretraining objectives in addition to the standard image-text matching objectives. This study highlights that targeted and controlled evaluations are a crucial step for a precise and rigorous test of the multimodal knowledge of vision-and-language models.

Using ASR-Generated Text for Spoken Language Modeling
Nicolas Hervé | Valentin Pelloin | Benoit Favre | Franck Dary | Antoine Laurent | Sylvain Meignier | Laurent Besacier
Proceedings of BigScience Episode #5 -- Workshop on Challenges & Perspectives in Creating Large Language Models

This paper aims at improving spoken language modeling (LM) using a very large amount of automatically transcribed speech. We leverage the INA (French National Audiovisual Institute) collection and obtain 19GB of text after applying ASR on 350,000 hours of diverse TV shows. From this, spoken language models are trained either by fine-tuning an existing LM (FlauBERT) or through training an LM from scratch. The new models (FlauBERT-Oral) will be shared with the community and are evaluated not only in terms of word prediction accuracy but also for two downstream tasks: classification of TV shows and syntactic parsing of speech. Experimental results show that FlauBERT-Oral is better than its initial FlauBERT version, demonstrating that, despite its inherently noisy nature, ASR-generated text can be useful to improve spoken language modeling.


Transferring Modern Named Entity Recognition to the Historical Domain: How to Take the Step?
Baptiste Blouin | Benoit Favre | Jeremy Auguste | Christian Henriot
Proceedings of the Workshop on Natural Language Processing for Digital Humanities

Named entity recognition is of high interest to digital humanities, in particular when mining historical documents. Although the task is mature in the field of NLP, results of contemporary models are not satisfactory on challenging documents corresponding to out-of-domain genres, noisy OCR output, or old variants of the target language. In this paper we study how model transfer methods, in the context of the aforementioned challenges, can improve historical named entity recognition according to how much effort is allocated to describing the target data, manually annotating small amounts of texts, or matching pre-training resources. In particular, we explore the situation where the class labels, as well as the quality of the documents to be processed, are different in the source and target domains. We perform extensive experiments with the transformer architecture on the LitBank and HIPE historical datasets, with different annotation schemes and character-level noise. They show that annotating 250 sentences can recover 93% of the full-data performance when models are pre-trained, that the choice of self-supervised and target-task pre-training data is crucial in the zero-shot setting, and that OCR errors can be handled by simulating noise on pre-training data and resorting to recent character-aware transformers.


Filtering conversations through dialogue acts labels for improving corpus-based convergence studies
Simone Fuscone | Benoit Favre | Laurent Prévot
Proceedings of the 21th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Cognitive models of conversation and research on user-adaptation in dialogue systems involve a better understanding of speakers’ convergence in conversation. Convergence effects have been established on controlled data sets, for various acoustic and linguistic variables. Tracking interpersonal dynamics on generic corpora has provided positive but more contrasted outcomes. We propose here to enrich large conversational corpora with dialogue act (DA) information. We use DA labels as filters in order to create data subsets featuring homogeneous conversational activity. Those data sets allow a more precise comparison between speakers’ speech variables. Our experiments consist of comparing convergence on low-level variables (Energy, Pitch, Speech Rate) measured on raw data sets with convergence measured on human and automatically DA-labelled data sets. We found that such filtering does help in observing convergence, suggesting that studies on interpersonal dynamics should consider such high-level dialogue activity types and their related NLP topics as important ingredients of their toolboxes.

Analyse sémantique robuste par apprentissage antagoniste pour la généralisation de domaine (Robust Semantic Parsing with Adversarial Learning for Domain Generalization)
Gabriel Marzinotto | Géraldine Damnati | Frédéric Béchet | Benoît Favre
Actes de la 6e conférence conjointe Journées d'Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Volume 4 : Démonstrations et résumés d'articles internationaux

We present summaries in French and in English of the article (Marzinotto et al., 2019) presented at the 2019 conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

Development of Multi-level Linguistic Alignment in Child-adult Conversations
Thomas Misiek | Benoit Favre | Abdellah Fourtassi
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics

Interactive alignment is a major mechanism of linguistic coordination. Here we study the way this mechanism emerges in development across the lexical, syntactic, and conceptual levels. We leverage NLP tools to analyze a large-scale corpus of conversations between adults and children aged 2 to 5 years. We found that, across development, children align consistently to adults above chance and that adults align consistently more to children than vice versa (even controlling for language production abilities). Besides these consistencies, we found a diversity of developmental trajectories across linguistic levels. These corpus-based findings provide strong support for an early onset of multi-level linguistic alignment in children and invite new experimental work.


Typological Features for Multilingual Delexicalised Dependency Parsing
Manon Scholivet | Franck Dary | Alexis Nasr | Benoit Favre | Carlos Ramisch
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

The existence of universal models to describe the syntax of languages has been debated for decades. The availability of resources such as the Universal Dependencies treebanks and the World Atlas of Language Structures makes it possible to study the plausibility of universal grammar from the perspective of dependency parsing. Our work investigates the use of high-level language descriptions in the form of typological features for multilingual dependency parsing. Our experiments on multilingual parsing for 40 languages show that typological information can indeed guide parsers to share information between similar languages beyond simple language identification.

Robust Semantic Parsing with Adversarial Learning for Domain Generalization
Gabriel Marzinotto | Géraldine Damnati | Frédéric Béchet | Benoît Favre
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Industry Papers)

This paper addresses the issue of generalization for Semantic Parsing in an adversarial framework. Building models that are more robust to inter-document variability is crucial for the integration of Semantic Parsing technologies in real applications. The underlying question throughout this study is whether adversarial learning can be used to train models on a higher level of abstraction in order to increase their robustness to lexical and stylistic variations. We propose to perform Semantic Parsing with a domain classification adversarial task, covering various use-cases with or without explicit knowledge of the domain. The strategy is first evaluated on a French corpus of encyclopedic documents, annotated with FrameNet, in an information retrieval perspective. This corpus constitutes a new public benchmark, gathering documents from various thematic domains and various sources. We show that adversarial learning yields improved results when using explicit domain classification as the adversarial task. We also propose an unsupervised domain discovery approach that yields equivalent improvements. The latter is also evaluated on a PropBank Semantic Role Labeling task on the CoNLL-2005 benchmark and is shown to increase the model’s generalization capabilities on out-of-domain data.


Adding Syntactic Annotations to Flickr30k Entities Corpus for Multimodal Ambiguous Prepositional-Phrase Attachment Resolution
Sebastien Delecraz | Alexis Nasr | Frederic Bechet | Benoit Favre
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

Correction automatique d’attachements prépositionnels par utilisation de traits visuels (PP-attachement resolution using visual features)
Sébastien Delecraz | Leonor Becerra-Bonache | Benoît Favre | Alexis Nasr | Frédéric Bechet
Actes de la Conférence TALN. Volume 1 - Articles longs, articles courts de TALN

Disambiguating prepositional-phrase attachments is a syntactic task that requires semantic knowledge, which can be extracted from an image associated with the text being processed. We present and analyze the difficulties of this task, for which we build a complete system trained on an extended version of the annotations of the Flickr30k Entities corpus. When lexical semantics is not available, visual information brings a 3% improvement.

Evaluation automatique de la satisfaction client à partir de conversations de type “chat” par réseaux de neurones récurrents avec mécanisme d’attention (Customer satisfaction prediction with attention-based RNNs from a chat contact center corpus)
Jeremy Auguste | Delphine Charlet | Géraldine Damnati | Benoit Favre | Frederic Bechet
Actes de la Conférence TALN. Volume 1 - Articles longs, articles courts de TALN

This article presents methods for evaluating customer satisfaction from very large corpora of chat conversations between customers and operators. Extracting knowledge in this context remains a challenge for natural language processing methods because of the interactive dimension and the properties of this new type of language, at the intersection of written and spoken language. We present a study that uses responses to user surveys as weak supervision to predict the satisfaction of users of an online technical and commercial support service.

Détection d’erreurs dans des transcriptions OCR de documents historiques par réseaux de neurones récurrents multi-niveau (Combining character level and word level RNNs for post-OCR error detection)
Thibault Magallon | Frederic Bechet | Benoit Favre
Actes de la Conférence TALN. Volume 1 - Articles longs, articles courts de TALN

Post-processing of OCR transcriptions seeks to detect errors in OCR output in order to try to correct them, two tasks evaluated by the ICDAR-2017 Post-OCR Text Correction competition. In this paper we present an error detection system based on a recurrent network model that combines word-level and character-level analysis of the text in two stages. This system was ranked second in three of the evaluated categories among 11 candidates in the competition.

Veyn at PARSEME Shared Task 2018: Recurrent Neural Networks for VMWE Identification
Nicolas Zampieri | Manon Scholivet | Carlos Ramisch | Benoit Favre
Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018)

This paper describes the Veyn system, submitted to the closed track of the PARSEME Shared Task 2018 on automatic identification of verbal multiword expressions (VMWEs). Veyn is based on a sequence tagger using recurrent neural networks. We represent VMWEs using a variant of the begin-inside-outside encoding scheme combined with the VMWE category tag. In addition to the system description, we present development experiments to determine the best tagging scheme. Veyn is freely available, covers 19 languages, and was ranked ninth (MWE-based) and eighth (Token-based) among 13 submissions, considering macro-averaged F1 across languages.


Proceedings of the MultiLing 2017 Workshop on Summarization and Summary Evaluation Across Source Types and Genres
George Giannakopoulos | Elena Lloret | John M. Conroy | Josef Steinberger | Marina Litvak | Peter Rankel | Benoit Favre
Proceedings of the MultiLing 2017 Workshop on Summarization and Summary Evaluation Across Source Types and Genres

MultiLing 2017 Overview
George Giannakopoulos | John Conroy | Jeff Kubina | Peter A. Rankel | Elena Lloret | Josef Steinberger | Marina Litvak | Benoit Favre
Proceedings of the MultiLing 2017 Workshop on Summarization and Summary Evaluation Across Source Types and Genres

In this brief report we present an overview of the MultiLing 2017 effort and workshop, as implemented within EACL 2017. MultiLing is a community-driven initiative that pushes the state-of-the-art in Automatic Summarization by providing data sets and fostering further research and development of summarization systems. This year the scope of the workshop was widened, bringing together researchers that work on summarization across sources, languages and genres. We summarize the main tasks planned and implemented this year, the contributions received, and we also provide insights on next steps.

Evaluation of word embeddings against cognitive processes: primed reaction times in lexical decision and naming tasks
Jeremy Auguste | Arnaud Rey | Benoit Favre
Proceedings of the 2nd Workshop on Evaluating Vector Space Representations for NLP

This work presents a framework for word similarity evaluation grounded on cognitive sciences experimental data. Word pair similarities are compared to reaction times of subjects in large scale lexical decision and naming tasks under semantic priming. Results show that GloVe embeddings lead to significantly higher correlation with experimental measurements than other controlled and off-the-shelf embeddings, and that the choice of a training corpus is less important than that of the algorithm. Comparison of rankings with other datasets shows that the cognitive phenomenon covers more aspects than simply word relatedness or similarity.

Correcting prepositional phrase attachments using multimodal corpora
Sebastien Delecraz | Alexis Nasr | Frederic Bechet | Benoit Favre
Proceedings of the 15th International Conference on Parsing Technologies

PP-attachments are an important source of errors in parsing natural language. We propose in this article to use data coming from a multimodal corpus, combining textual, visual and conceptual information, as well as a correction strategy, to propose alternative attachments in the output of a parser.

Détection de coréférences de bout en bout en français (End-to-end coreference resolution for French)
Elisabeth Godbert | Benoit Favre
Actes des 24ème Conférence sur le Traitement Automatique des Langues Naturelles. Volume 2 - Articles courts

Our goal is to build an automatic coreference detection system that is as general as possible, handling pronominal anaphora and direct coreference. In this article we describe the processing steps applied to texts in the system we developed: (i) annotation with lexical and syntactic features by the Macaon system; (ii) mention detection by a model learned on the ANCOR corpus; (iii) semantic annotation of mentions using two resources, the DEM and the LVF; (iv) coreference annotation by a rule-based system. The system is evaluated on the ANCOR corpus.

Apprentissage d’agents conversationnels pour la gestion de relations clients (Training chatbots for customer relation management)
Benoit Favre | Frederic Bechet | Géraldine Damnati | Delphine Charlet
Actes des 24ème Conférence sur le Traitement Automatique des Langues Naturelles. Volume 3 - Démonstrations

This work demonstrates the feasibility of training chatbots on conversation logs in the customer relations domain. Systems based on language models, information retrieval, and translation are compared for the task.


Word Embedding Evaluation and Combination
Sahar Ghannay | Benoit Favre | Yannick Estève | Nathalie Camelin
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Word embeddings have been successfully used in several natural language processing (NLP) and speech processing tasks. Different approaches have been introduced to calculate word embeddings through neural networks. In the literature, many studies have focused on word embedding evaluation, but to our knowledge, there are still some gaps. This paper presents a study focusing on a rigorous comparison of the performances of different kinds of word embeddings. These performances are evaluated on different NLP and linguistic tasks, while all the word embeddings are estimated on the same training data using the same vocabulary, the same number of dimensions, and other similar characteristics. The evaluation results reported in this paper match those in the literature, since they point out that the improvements achieved by a word embedding in one task are not consistently observed across all tasks. For that reason, this paper investigates and evaluates approaches to combine word embeddings in order to take advantage of their complementarity, and to look for effective word embeddings that can achieve good performances on all tasks. In conclusion, this paper provides new insights into the intrinsic qualities of the well-known word embedding families, which can differ from those provided by works previously published in the scientific literature.

A Document Repository for Social Media and Speech Conversations
Adam Funk | Robert Gaizauskas | Benoit Favre
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

We present a successfully implemented document repository REST service for flexible SCRUD (search, create, read, update, delete) storage of social media conversations, using a GATE/TIPSTER-like document object model and providing a query language for document features. This software is currently being used in the SENSEI research project and will be published as open-source software before the project ends. It is, to the best of our knowledge, the first freely available, general purpose data repository to support large-scale multimodal (i.e., speech or text) conversation analytics.

Summarizing Behaviours: An Experiment on the Annotation of Call-Centre Conversations
Morena Danieli | Balamurali A R | Evgeny Stepanov | Benoit Favre | Frederic Bechet | Giuseppe Riccardi
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Annotating and predicting behavioural aspects in conversations is becoming critical in the conversational analytics industry. In this paper we look into inter-annotator agreement of agent behaviour dimensions on two call center corpora. We find that the task can be annotated consistently over time, but that subjectivity issues impact the quality of the annotation. The reformulation of some of the annotated dimensions is suggested in order to improve agreement.

Fusion d’espaces de représentations multimodaux pour la reconnaissance du rôle du locuteur dans des documents télévisuels (Multimodal embedding fusion for robust speaker role recognition in video broadcast)
Sebastien Delecraz | Frederic Bechet | Benoit Favre | Mickael Rouvier
Actes de la conférence conjointe JEP-TALN-RECITAL 2016. volume 1 : JEP

Identifying a speaker’s role in television shows is a person classification problem over a list of roles such as presenter, journalist, guest, etc. Because of the asynchrony between modalities, and the lack of video corpora annotated in all modalities, often only one modality is used. In this article we present a multimodal fusion of the audio, text, and image representation spaces for speaker role recognition on asynchronous data. The monomodal representation spaces are trained on exogenous data corpora and then fine-tuned using deep neural networks on a corpus of French broadcasts for our classification task. Experiments on the REPERE corpus highlight the gains of fusion at the level of representation spaces compared with standard late fusion methods.

Détection de concepts pertinents pour le résumé automatique de conversations par recombinaison de patrons (Relevant concepts detection for the automatic summary of conversations using patterns recombination)
Jérémy Trione | Benoit Favre | Frederic Bechet
Actes de la conférence conjointe JEP-TALN-RECITAL 2016. volume 2 : TALN (Articles longs)

This paper describes an approach for creating summaries of spoken conversations by template filling. The templates are generated automatically from fragments generalized from a corpus of training summaries. The information needed to fill the templates is detected in the conversation transcripts and used to select candidate fragments. The approach obtains a ROUGE-2 score of 0.116 on the RATP-DECODA corpus. The results show that this abstractive approach outperforms the extractive approaches commonly used in automatic summarization.

SENSEI-LIF at SemEval-2016 Task 4: Polarity embedding fusion for robust sentiment analysis
Mickael Rouvier | Benoit Favre
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)


Concept-based Summarization using Integer Linear Programming: From Concept Pruning to Multiple Optimal Solutions
Florian Boudin | Hugo Mougard | Benoit Favre
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Rapid FrameNet annotation of spoken conversation transcripts
Jeremy Trione | Frederic Bechet | Benoit Favre | Alexis Nasr
Proceedings of the 11th Joint ACL-ISO Workshop on Interoperable Semantic Annotation (ISA-11)

Call Centre Conversation Summarization: A Pilot Task at Multiling 2015
Benoit Favre | Evgeny Stepanov | Jérémy Trione | Frédéric Béchet | Giuseppe Riccardi
Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue

MultiLing 2015: Multilingual Summarization of Single and Multi-Documents, On-line Fora, and Call-center Conversations
George Giannakopoulos | Jeff Kubina | John Conroy | Josef Steinberger | Benoit Favre | Mijail Kabadjov | Udo Kruschwitz | Massimo Poesio
Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue


A Repository of State of the Art and Competitive Baseline Summaries for Generic News Summarization
Kai Hong | John Conroy | Benoit Favre | Alex Kulesza | Hui Lin | Ani Nenkova
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In the period since 2004, many novel sophisticated approaches for generic multi-document summarization have been developed. Intuitive simple approaches have also been shown to perform unexpectedly well for the task. Yet it is practically impossible to compare the existing approaches directly, because systems have been evaluated on different datasets, with different evaluation measures, against different sets of comparison systems. Here we present a corpus of summaries produced by several state-of-the-art extractive summarization systems or by popular baseline systems. The inputs come from the 2004 DUC evaluation, the latest year in which generic summarization was addressed in a shared task. We use the same settings for ROUGE automatic evaluation to compare the systems directly and analyze the statistical significance of the differences in performance. We show that in terms of average scores the state-of-the-art systems appear similar but that in fact they produce very different summaries. Our corpus will facilitate future research on generic summarization and motivates the need for development of more sensitive evaluation measures and for approaches to system combination in summarization.

Automatically enriching spoken corpora with syntactic information for linguistic studies
Alexis Nasr | Frederic Bechet | Benoit Favre | Thierry Bazillon | Jose Deulofeu | Andre Valli
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Syntactic parsing of speech transcriptions faces the problem of disfluencies that break the syntactic structure of the utterances. We propose in this paper two solutions to this problem. The first one relies on a disfluency predictor that detects disfluencies and removes them prior to parsing. The second one integrates the disfluencies in the syntactic structure of the utterances and trains a disfluency-aware parser.


Syntactic annotation of spontaneous speech: application to call-center conversation data
Thierry Bazillon | Melanie Deplano | Frederic Bechet | Alexis Nasr | Benoit Favre
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

This paper describes the syntactic annotation process of the DECODA corpus. This corpus contains manual transcriptions of spoken conversations recorded in the French call-center of the Paris Public Transport Authority (RATP). Three levels of syntactic annotation have been performed with a semi-supervised approach: POS tags, syntactic chunks and dependency parses. The main idea is to use off-the-shelf NLP tools and models, originally developed and trained on written text, to perform a first automatic annotation of the manually transcribed corpus. At the same time, a fully manual annotation process is performed on a subset of the original corpus, called the GOLD corpus. An iterative process is then applied, consisting of manually correcting errors found in the automatic annotations, retraining the linguistic models of the NLP tools on this corrected corpus, then checking the quality of the adapted models on the fully manual annotations of the GOLD corpus. This process iterates until a certain error rate is reached. This paper describes this process and the main issues arising when adapting NLP tools to process speech transcriptions, and presents the first evaluations performed with these newly adapted tools.

Leveraging study of robustness and portability of spoken language understanding systems across languages and domains: the PORTMEDIA corpora
Fabrice Lefèvre | Djamel Mostefa | Laurent Besacier | Yannick Estève | Matthieu Quignard | Nathalie Camelin | Benoit Favre | Bassam Jabaian | Lina M. Rojas-Barahona
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

The PORTMEDIA project is intended to develop new corpora for the evaluation of spoken language understanding systems. The newly collected data are in the field of human-machine dialogue systems for tourist information in French, in line with the MEDIA corpus. Transcriptions and semantic annotations, obtained by low-cost procedures, are provided to allow a thorough evaluation of the systems' capabilities in terms of robustness and portability across languages and domains. A new test set with some adaptation data is prepared for each case: Italian as an example of a new language, and ticket reservation as an example of a new domain. Finally, the work is complemented by the proposal of a new high-level semantic annotation scheme well suited to dialogue data.

Percol0 - un système multimodal de détection de personnes dans des documents vidéo (Percol0 - A multimodal person detection system in video documents) [in French]
Frederic Bechet | Remi Auguste | Stephane Ayache | Delphine Charlet | Geraldine Damnati | Benoit Favre | Corinne Fredouille | Christophe Levy | Georges Linares | Jean Martinet
Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, volume 1: JEP

Robustesse et portabilités multilingue et multi-domaines des systèmes de compréhension de la parole : les corpus du projet PortMedia (Robustness and portability of spoken language understanding systems among languages and domains : the PORTMEDIA project) [in French]
Fabrice Lefèvre | Djamel Mostefa | Laurent Besacier | Yannick Estève | Matthieu Quignard | Nathalie Camelin | Benoit Favre | Bassam Jabaian | Lina Rojas-Barahona
Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, volume 1: JEP

Generative Constituent Parsing and Discriminative Dependency Reranking: Experiments on English and French
Joseph Le Roux | Benoît Favre | Alexis Nasr | Seyed Abolghasem Mirroshandel
Proceedings of the ACL 2012 Joint Workshop on Statistical Parsing and Semantic Processing of Morphologically Rich Languages


Modèles génératif et discriminant en analyse syntaxique : expériences sur le corpus arboré de Paris 7 (Generative and discriminative models in parsing: experiments on the Paris 7 Treebank)
Joseph Le Roux | Benoît Favre | Seyed Abolghasem Mirroshandel | Alexis Nasr
Actes de la 18e conférence sur le Traitement Automatique des Langues Naturelles. Articles longs

We present a two-stage architecture for syntactic parsing. First, a constituency parser builds, for each sentence, a list of analyses that are converted into dependency trees. These trees are then rescored by a discriminative reranker. This method makes it possible to take into account information to which the parser has no access, in particular functional annotations. We validate our approach with an evaluation on the Paris 7 Treebank. The second stage significantly improves the quality of the returned analyses, whichever metric is used.

<StuMaBa>: From Deep Representation to Surface
Bernd Bohnet | Simon Mille | Benoît Favre | Leo Wanner
Proceedings of the 13th European Workshop on Natural Language Generation

MACAON An NLP Tool Suite for Processing Word Lattices
Alexis Nasr | Frédéric Béchet | Jean-François Rey | Benoît Favre | Joseph Le Roux
Proceedings of the ACL-HLT 2011 System Demonstrations


The UMUS System for Named Entity Generation at GREC 2010
Benoit Favre | Bernd Bohnet
Proceedings of the 6th International Natural Language Generation Conference


A Scalable Global Model for Summarization
Dan Gillick | Benoit Favre
Proceedings of the Workshop on Integer Linear Programming for Natural Language Processing

ICSI-CRF: The Generation of References to the Main Subject and Named Entities Using Conditional Random Fields
Benoit Favre | Bernd Bohnet
Proceedings of the 2009 Workshop on Language Generation and Summarisation (UCNLG+Sum 2009)


Robust Named Entity Extraction from Large Spoken Archives
Benoît Favre | Frédéric Béchet | Pascal Nocéra
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing