Building a Production Model for Retrieval-Based Chatbots

Kyle Swanson, Lili Yu, Christopher Fox, Jeremy Wohlwend, Tao Lei


Abstract
Response suggestion is an important task for building human-computer conversation systems. Recent approaches to conversation modeling have introduced new model architectures with impressive results, but relatively little attention has been paid to whether these models would be practical in a production setting. In this paper, we describe the unique challenges of building a production retrieval-based conversation system, which selects outputs from a whitelist of candidate responses. To address these challenges, we propose a dual encoder architecture which performs rapid inference and scales well with the size of the whitelist. We also introduce and compare two methods for generating whitelists, and we carry out a comprehensive analysis of the model and whitelists. Experimental results on a large, proprietary help desk chat dataset, including both offline metrics and a human evaluation, indicate production-quality performance and illustrate key lessons about conversation modeling in practice.
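The abstract's claim that a dual encoder scales well with whitelist size can be illustrated with a minimal sketch. It is not the authors' model: the `encode` function below is a hypothetical hashed bag-of-words stand-in for a learned encoder, and the names `build_index` and `suggest` are assumptions introduced only for illustration. What it shows is the general dual-encoder pattern, in which whitelist responses are embedded once ahead of time, so each request needs only one context encoding plus a dot product per candidate.

```python
import math
import zlib


def encode(text, dim=1024):
    """Toy stand-in for a learned encoder (assumption, not the paper's model):
    a hashed bag-of-words vector, L2-normalized."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # crc32 gives a hash that is stable across runs, unlike built-in hash().
        vec[zlib.crc32(token.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def build_index(whitelist):
    """Embed every whitelist response once, offline."""
    return [(response, encode(response)) for response in whitelist]


def suggest(context, index, k=1):
    """Encode the context once, then rank candidates by dot product."""
    query = encode(context)
    scored = [
        (sum(q * c for q, c in zip(query, cand)), response)
        for response, cand in index
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [response for _, response in scored[:k]]
```

Because the candidate vectors are computed ahead of time, growing the whitelist adds only dot products at request time, not additional encoder passes.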
Anthology ID:
W19-4104
Original:
W19-4104v1
Version 2:
W19-4104v2
Volume:
Proceedings of the First Workshop on NLP for Conversational AI
Month:
August
Year:
2019
Address:
Florence, Italy
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
32–41
URL:
https://aclanthology.org/W19-4104
DOI:
10.18653/v1/W19-4104
Cite (ACL):
Kyle Swanson, Lili Yu, Christopher Fox, Jeremy Wohlwend, and Tao Lei. 2019. Building a Production Model for Retrieval-Based Chatbots. In Proceedings of the First Workshop on NLP for Conversational AI, pages 32–41, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Building a Production Model for Retrieval-Based Chatbots (Swanson et al., ACL 2019)
PDF:
https://preview.aclanthology.org/ingestion-script-update/W19-4104.pdf