2022
Lost in Distillation: A Case Study in Toxicity Modeling
Alyssa Chvasta | Alyssa Lees | Jeffrey Sorensen | Lucy Vasserman | Nitesh Goyal
Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH)
In an era of increasingly large pre-trained language models, knowledge distillation is a powerful tool for transferring information from a large model to a smaller one. In particular, distillation is of tremendous benefit under real-world constraints such as serving latency or serving at scale. However, a loss of robustness in language understanding may be hidden in the process and not immediately revealed by high-level evaluation metrics. In this work, we investigate these hidden costs: what is "lost in distillation", especially with regard to identity-based model bias, using the case study of toxicity modeling. With reproducible models trained on open source data, we examine models distilled from a BERT teacher baseline, and we use both open source and proprietary large-scale models to measure these hidden performance costs.
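The paper does not include code here; as background, the standard knowledge distillation objective the abstract alludes to (Hinton et al., 2015) trains a student to match the teacher's temperature-softened output distribution. The sketch below is an illustrative, self-contained implementation of that objective, not the authors' actual training setup; all function names are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    # Softened softmax: a higher temperature flattens the distribution,
    # exposing the teacher's relative confidence in non-target classes.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) over temperature-softened distributions,
    # scaled by T^2 so gradients keep a comparable magnitude across T.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl
```

In practice this soft-label term is combined with the ordinary cross-entropy on hard labels; the concern the paper raises is that a student minimizing this aggregate loss can still degrade on identity-related subgroups without the headline metric revealing it.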