Abstract
Semi-supervised learning is an efficient way to improve performance for natural language processing systems. In this work, we propose Para-SSL, a scheme to generate candidate utterances using paraphrasing and methods from semi-supervised learning. In order to perform paraphrase generation in the context of a dialog system, we automatically extract paraphrase pairs to create a paraphrase corpus. Using this data, we build a paraphrase generation system and perform one-to-many generation, followed by a validation step to select only the utterances of good quality. The paraphrase-based semi-supervised learning is applied to five functionalities in a natural language understanding system. Our proposed method for semi-supervised learning using paraphrase generation does not require user utterances and can be applied prior to releasing a new functionality to a system. Experiments show that we can achieve up to 19% relative slot error reduction without access to user utterances, and up to 35% when leveraging live traffic utterances.
- Anthology ID:
- W19-2306
- Volume:
- Proceedings of the Workshop on Methods for Optimizing and Evaluating Neural Language Generation
- Month:
- June
- Year:
- 2019
- Address:
- Minneapolis, Minnesota
- Editors:
- Antoine Bosselut, Asli Celikyilmaz, Marjan Ghazvininejad, Srinivasan Iyer, Urvashi Khandelwal, Hannah Rashkin, Thomas Wolf
- Venue:
- NAACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 45–54
- URL:
- https://aclanthology.org/W19-2306
- DOI:
- 10.18653/v1/W19-2306
- Cite (ACL):
- Eunah Cho, He Xie, and William M. Campbell. 2019. Paraphrase Generation for Semi-Supervised Learning in NLU. In Proceedings of the Workshop on Methods for Optimizing and Evaluating Neural Language Generation, pages 45–54, Minneapolis, Minnesota. Association for Computational Linguistics.
- Cite (Informal):
- Paraphrase Generation for Semi-Supervised Learning in NLU (Cho et al., NAACL 2019)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/W19-2306.pdf