Emilie Colin


2019

Generating Text from Anonymised Structures
Emilie Colin | Claire Gardent
Proceedings of the 12th International Conference on Natural Language Generation

Surface realisation (SR) consists in generating text from a meaning representation (MR). In this paper, we introduce a new parallel dataset of deep meaning representations and French sentences, and we present a novel method for MR-to-text generation which seeks to generalise by abstracting away from lexical content. Most current work on natural language generation focuses on generating text that matches a reference, using BLEU as the evaluation criterion. In this paper, we additionally consider the model’s ability to reintroduce the function words that are absent from the deep input meaning representations. We show that our approach increases both the BLEU score and the scores used to assess function word generation.

2018

Generating Syntactic Paraphrases
Emilie Colin | Claire Gardent
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

We study the automatic generation of syntactic paraphrases using four different models for generation: data-to-text generation, text-to-text generation, text reduction and text expansion. We derive training data for each of these tasks from the WebNLG dataset and we show (i) that conditioning generation on syntactic constraints effectively permits the generation of syntactically distinct paraphrases for the same input and (ii) that exploiting different types of input (data, text or data+text) further increases the number of distinct paraphrases that can be generated for a given input.

2017

Création automatique d’une grammaire syntaxico-sémantique (Syntactic-semantic grammar automatic creation)
Emilie Colin
Actes des 24ème Conférence sur le Traitement Automatique des Langues Naturelles. 19es REncontres jeunes Chercheurs en Informatique pour le TAL (RECITAL 2017)

We propose a new method for the automatic creation of lexicalised syntactic-semantic grammars. At present, grammars are created either by hand or by automated processing of a treebank. Our proposal is to extract from VerbNet data a core grammar of English (canonical forms of verbs and noun phrases) that integrates VerbNet semantics. Our aim is to take advantage of the large existing resources to produce a high-quality symbolic text generation system for a restricted domain.

2016

The WebNLG Challenge: Generating Text from DBPedia Data
Emilie Colin | Claire Gardent | Yassine M’rabet | Shashi Narayan | Laura Perez-Beltrachini
Proceedings of the 9th International Natural Language Generation conference