Cyprien de Lichy
2023
GAN-LM: Generative Adversarial Network using Language Models for Downstream Applications
Dae Yon Hwang | Yaroslav Nechaev | Cyprien de Lichy | Renxian Zhang
Proceedings of the 16th International Natural Language Generation Conference
In this work, we investigate data augmentation methods to improve the performance of state-of-the-art models on four different downstream tasks. Specifically, we propose the Generative Adversarial Network using Language Models (GAN-LM) approach, which combines a deep generative model with a pre-trained language model to produce diverse augmentations. We compare GAN-LM to various conventional methods at the non-contextual and contextual levels on four public datasets: ZESHEL for zero-shot entity linking, TREC for question classification, STS-B for sentence-pair semantic textual similarity (STS), and mSTS for multilingual sentence-pair STS. Additionally, we subsample these datasets to study the impact of such augmentations in low-resource settings where only a limited amount of training data is available. Compared to the state-of-the-art methods on these downstream tasks, the GAN-LM approach achieves the best performance in most cases. Finally, we investigate ways of combining GAN-LM with other augmentation methods to complement our proposed approach. The code developed for reproducibility is included in the supplementary material.
2021
Meta-Learning for Few-Shot Named Entity Recognition
Cyprien de Lichy | Hadrien Glaude | William Campbell
Proceedings of the 1st Workshop on Meta Learning and Its Applications to Natural Language Processing
Meta-learning has recently been proposed to learn models and algorithms that can generalize from a handful of examples. However, applications to structured prediction and textual tasks pose challenges for meta-learning algorithms. In this paper, we apply two meta-learning algorithms, Prototypical Networks and Reptile, to few-shot Named Entity Recognition (NER), including a method for incorporating language model pre-training and Conditional Random Fields (CRF). We propose a task generation scheme for converting classical NER datasets into the few-shot setting, for both training and evaluation. Using three public datasets, we show these meta-learning algorithms outperform a reasonable fine-tuned BERT baseline. In addition, we propose a novel combination of Prototypical Networks and Reptile.
2019
A Closer Look At Feature Space Data Augmentation For Few-Shot Intent Classification
Varun Kumar | Hadrien Glaude | Cyprien de Lichy | William Campbell
Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019)
New conversation topics and functionalities are constantly being added to conversational AI agents like Amazon Alexa and Apple Siri. As data collection and annotation are not scalable and are often costly, only a handful of examples for the new functionalities are available, which results in poor generalization performance. We formulate this as a Few-Shot Integration (FSI) problem where a few examples are used to introduce a new intent. In this paper, we study six feature space data augmentation methods to improve classification performance in the FSI setting in combination with both supervised and unsupervised representation learning methods such as BERT. Through realistic experiments on two public conversational datasets, SNIPS and the Facebook Dialog corpus, we show that data augmentation in feature space provides an effective way to improve intent classification performance in the few-shot setting beyond traditional transfer learning approaches. In particular, we show that (a) upsampling in latent space is a competitive baseline for feature space augmentation, and (b) adding the difference between two examples to a new example is a simple yet effective data augmentation method.
Co-authors
- Dae Yon Hwang 1
- Hadrien Glaude 2
- Renxian Zhang 1
- Varun Kumar 1
- William Campbell 1