@inproceedings{lin-etal-2024-generate,
title = "Generate then Refine: Data Augmentation for Zero-shot Intent Detection",
author = "Lin, I-Fan and
Hasibi, Faegheh and
Verberne, Suzan",
editor = "Al-Onaizan, Yaser and
Bansal, Mohit and
Chen, Yun-Nung",
booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2024",
month = nov,
year = "2024",
address = "Miami, Florida, USA",
publisher = "Association for Computational Linguistics",
url = "https://preview.aclanthology.org/Add-Cong-Liu-Florida-Atlantic-University-author-id/2024.findings-emnlp.768/",
doi = "10.18653/v1/2024.findings-emnlp.768",
pages = "13138--13146",
abstract = "In this short paper we propose a data augmentation method for intent detection in zero-resource domains.Existing data augmentation methods rely on few labelled examples for each intent category, which can be expensive in settings with many possible intents.We use a two-stage approach: First, we generate utterances for intent labels using an open-source large language model in a zero-shot setting. Second, we develop a smaller sequence-to-sequence model (the Refiner), to improve the generated utterances. The Refiner is fine-tuned on seen domains and then applied to unseen domains. We evaluate our method by training an intent classifier on the generated data, and evaluating it on real (human) data.We find that the Refiner significantly improves the data utility and diversity over the zero-shot LLM baseline for unseen domains and over common baseline approaches.Our results indicate that a two-step approach of a generative LLM in zero-shot setting and a smaller sequence-to-sequence model can provide high-quality data for intent detection."
}