2023
pdf
abs
Regression-Free Model Updates for Spoken Language Understanding
Andrea Caciolai
|
Verena Weber
|
Tobias Falke
|
Alessandro Pedrani
|
Davide Bernardi
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track)
In real-world systems, an important requirement for model updates is to avoid regressions in user experience caused by flips of previously correct classifications to incorrect ones. Multiple techniques for that have been proposed in the recent literature. In this paper, we apply one such technique, focal distillation, to model updates in a goal-oriented dialog system and assess its usefulness in practice. In particular, we evaluate its effectiveness for key language understanding tasks, including sentence classification and sequence labeling tasks, we further assess its effect when applied to repeated model updates over time, and test its compatibility with mislabeled data. Our experiments on a public benchmark and data from a deployed dialog system demonstrate that focal distillation can substantially reduce regressions, at only minor drops in accuracy, and that it further outperforms naive supervised training in challenging mislabeled data and label expansion settings.
2022
pdf
abs
Semi-supervised Adversarial Text Generation based on Seq2Seq models
Hieu Le
|
Dieu-thu Le
|
Verena Weber
|
Chris Church
|
Kay Rottmann
|
Melanie Bradford
|
Peter Chin
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track
To improve deep learning models’ robustness, adversarial training has been frequently used in computer vision with satisfying results. However, adversarial perturbation on text have turned out to be more challenging due to the discrete nature of text. The generated adversarial text might not sound natural or does not preserve semantics, which is the key for real world applications where text classification is based on semantic meaning. In this paper, we describe a new way for generating adversarial samples by using pseudo-labeled in-domain text data to train a seq2seq model for adversarial generation and combine it with paraphrase detection. We showcase the benefit of our approach for a real-world Natural Language Understanding (NLU) task, which maps a user’s request to an intent. Furthermore, we experiment with gradient-based training for the NLU task and try using token importance scores to guide the adversarial text generation. We show that our approach can generate realistic and relevant adversarial samples compared to other state-of-the-art adversarial training methods. Applying adversarial training using these generated samples helps the NLU model to recover up to 70% of these types of errors and makes the model more robust, especially in the tail distribution in a large scale real world application.
2021
pdf
bib
abs
It is better to Verify: Semi-Supervised Learning with a human in the loop for large-scale NLU models
Verena Weber
|
Enrico Piovano
|
Melanie Bradford
Proceedings of the Second Workshop on Data Science with Human in the Loop: Language Advances
When a NLU model is updated, new utter- ances must be annotated to be included for training. However, manual annotation is very costly. We evaluate a semi-supervised learning workflow with a human in the loop in a produc- tion environment. The previous NLU model predicts the annotation of the new utterances, a human then reviews the predicted annotation. Only when the NLU prediction is assessed as incorrect the utterance is sent for human anno- tation. Experimental results show that the pro- posed workflow boosts the performance of the NLU model while significantly reducing the annotation volume. Specifically, in our setup, we see improvements of up to 14.16% for a recall-based metric and up to 9.57% for a F1- score based metric, while reducing the annota- tion volume by 97% and overall cost by 60% for each iteration.
pdf
abs
Combining semantic search and twin product classification for recognition of purchasable items in voice shopping
Dieu-Thu Le
|
Verena Weber
|
Melanie Bradford
Proceedings of the 4th Workshop on e-Commerce and NLP
The accuracy of an online shopping system via voice commands is particularly important and may have a great impact on customer trust. This paper focuses on the problem of detecting if an utterance contains actual and purchasable products, thus referring to a shopping-related intent in a typical Spoken Language Understanding architecture consist- ing of an intent classifier and a slot detec- tor. Searching through billions of products to check if a detected slot is a purchasable item is prohibitively expensive. To overcome this problem, we present a framework that (1) uses a retrieval module that returns the most rele- vant products with respect to the detected slot, and (2) combines it with a twin network that decides if the detected slot is indeed a pur- chasable item or not. Through various exper- iments, we show that this architecture outper- forms a typical slot detector approach, with a gain of +81% in accuracy and +41% in F1 score.