Towards Need-Based Spoken Language Understanding Model Updates: What Have We Learned?

Quynh Do, Judith Gaspers, Daniil Sorokin, Patrick Lehnen


Abstract
In productionized machine learning systems, online model performance is known to deteriorate over time when there is a distributional drift between offline training and online application data. As a remedy, models are typically retrained at fixed time intervals, incurring high computational and manual costs. This work aims to decrease such costs in productionized, large-scale Spoken Language Understanding systems. In particular, we develop a need-based retraining strategy guided by an efficient drift detector and discuss the arising challenges, including system complexity, overlapping model releases, limited observability, and the absence of annotated resources at runtime. We present empirical results on historical data and confirm the utility of our design decisions via an online A/B experiment.
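The abstract does not specify which drift detector the authors use, so the following is only an illustrative sketch of the general idea of need-based retraining: monitor a drift statistic between offline and online data and trigger retraining only when it crosses a threshold. Here the Population Stability Index (PSI), a common categorical drift metric, stands in for the paper's detector; the function names, the PSI choice, and the 0.2 threshold are all assumptions, not the paper's method.

```python
import math
from collections import Counter

def psi(expected, actual, eps=1e-6):
    """Population Stability Index between two categorical samples.

    `expected` and `actual` are lists of category labels (e.g. predicted
    intents on offline evaluation data vs. live traffic). Higher PSI means
    larger distributional drift; a common rule of thumb flags PSI > 0.2.
    The small `eps` avoids log(0) for categories absent from one sample.
    """
    categories = set(expected) | set(actual)
    e_counts, a_counts = Counter(expected), Counter(actual)
    e_total, a_total = len(expected), len(actual)
    score = 0.0
    for c in categories:
        e_p = e_counts[c] / e_total + eps
        a_p = a_counts[c] / a_total + eps
        score += (a_p - e_p) * math.log(a_p / e_p)
    return score

def needs_retraining(offline_labels, online_labels, threshold=0.2):
    """Need-based trigger: retrain only when measured drift exceeds the threshold."""
    return psi(offline_labels, online_labels) > threshold
```

In contrast to fixed-interval retraining, such a trigger only spends compute when the monitored distribution has actually moved, which is the cost-saving motivation the abstract describes.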
Anthology ID:
2022.emnlp-industry.11
Volume:
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track
Month:
December
Year:
2022
Address:
Abu Dhabi, UAE
Editors:
Yunyao Li, Angeliki Lazaridou
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
121–127
URL:
https://aclanthology.org/2022.emnlp-industry.11
DOI:
10.18653/v1/2022.emnlp-industry.11
Cite (ACL):
Quynh Do, Judith Gaspers, Daniil Sorokin, and Patrick Lehnen. 2022. Towards Need-Based Spoken Language Understanding Model Updates: What Have We Learned?. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track, pages 121–127, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
Towards Need-Based Spoken Language Understanding Model Updates: What Have We Learned? (Do et al., EMNLP 2022)
PDF:
https://preview.aclanthology.org/landing_page/2022.emnlp-industry.11.pdf