CRUISE: Cold-Start New Skill Development via Iterative Utterance Generation

Yilin Shen, Avik Ray, Abhishek Patel, Hongxia Jin


Abstract
We present a system, CRUISE, that guides ordinary software developers in building a high-quality natural language understanding (NLU) engine from scratch. Building such an engine is the fundamental step in developing a new skill for a personal assistant. Unlike existing solutions that require either developers or crowdsourcing to manually generate and annotate a large number of utterances, we design a hybrid rule-based and data-driven approach that iteratively generates additional utterances. Our system requires only a light human workload: iteratively pruning incorrect utterances. CRUISE outputs a well-trained NLU engine and a large-scale annotated utterance corpus that third parties can use to develop their custom skills. Using both a benchmark dataset and custom datasets that we collected in real-world settings, we validate the high quality of CRUISE-generated utterances through competitive NLU performance and human evaluation. We also show a substantial reduction in human workload, in terms of both cognitive load and pruning time.
Anthology ID: P18-4018
Volume: Proceedings of ACL 2018, System Demonstrations
Month: July
Year: 2018
Address: Melbourne, Australia
Editors: Fei Liu, Thamar Solorio
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 105–110
URL: https://aclanthology.org/P18-4018
DOI: 10.18653/v1/P18-4018
Cite (ACL): Yilin Shen, Avik Ray, Abhishek Patel, and Hongxia Jin. 2018. CRUISE: Cold-Start New Skill Development via Iterative Utterance Generation. In Proceedings of ACL 2018, System Demonstrations, pages 105–110, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal): CRUISE: Cold-Start New Skill Development via Iterative Utterance Generation (Shen et al., ACL 2018)
PDF: https://preview.aclanthology.org/nschneid-patch-1/P18-4018.pdf
Data: ATIS