Bas van Berkel


2021

Exploring Inspiration Sets in a Data Programming Pipeline for Product Moderation
Justine Winkler | Simon Brugman | Bas van Berkel | Martha Larson
Proceedings of the 4th Workshop on e-Commerce and NLP

We carry out a case study on the use of data programming to create training data for classifiers used for product moderation on a large e-commerce platform. Data programming is a recently introduced technique that uses human-defined rules to generate training data sets without tedious item-by-item hand labeling. Our study investigates methods for allowing product moderators to quickly modify the rules given their knowledge of the domain and, especially, of textual item descriptions. Our results show promise that moderators can use this approach to steer the training data, enabling fast and close control of the classifiers that detect policy violations.
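
To make the idea of rule-based labeling concrete, here is a minimal sketch of how human-defined rules can assign noisy labels to item descriptions. It is not the authors' pipeline: the keywords, the function names (`lf_mentions_replica`, `weak_label`, and so on), and the label values are all hypothetical. Votes are combined by a simple majority vote. This stands in for the learned label model that data programming systems typically use.

```python
from collections import Counter

# Hypothetical label values; ABSTAIN means a rule has no opinion on the item.
ABSTAIN, OK, VIOLATION = -1, 0, 1


# Each labeling function encodes one moderator-written rule (keywords are illustrative).
def lf_mentions_replica(text):
    return VIOLATION if "replica" in text.lower() else ABSTAIN


def lf_mentions_counterfeit(text):
    return VIOLATION if "counterfeit" in text.lower() else ABSTAIN


def lf_mentions_official(text):
    return OK if "official" in text.lower() else ABSTAIN


LABELING_FUNCTIONS = [lf_mentions_replica, lf_mentions_counterfeit, lf_mentions_official]


def weak_label(text, lfs=LABELING_FUNCTIONS):
    """Combine rule votes by majority; abstain when no rule fires or votes tie."""
    votes = [vote for vote in (lf(text) for lf in lfs) if vote != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # conflicting rules with equal support
    return ranked[0][0]


def build_training_set(descriptions):
    """Keep only the items that received a non-abstain weak label."""
    labeled = ((text, weak_label(text)) for text in descriptions)
    return [(text, label) for text, label in labeled if label != ABSTAIN]
```

In a setup like this, a moderator would steer the training data by editing or adding rules and then regenerating the labels, rather than relabeling items one by one.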