Exploring Inspiration Sets in a Data Programming Pipeline for Product Moderation
Justine Winkler, Simon Brugman, Bas van Berkel, Martha Larson
Abstract
We carry out a case study on the use of data programming to create data to train classifiers used for product moderation on a large e-commerce platform. Data programming is a recently-introduced technique that uses human-defined rules to generate training data sets without tedious item-by-item hand labeling. Our study investigates methods for allowing product moderators to quickly modify the rules given their knowledge of the domain and, especially, of textual item descriptions. Our results show promise that moderators can use this approach to steer the training data, making possible fast and close control of classifiers that detect policy violations.
- Anthology ID: 2021.ecnlp-1.16
- Volume: Proceedings of the 4th Workshop on e-Commerce and NLP
- Month: August
- Year: 2021
- Address: Online
- Venue: ECNLP
- Publisher: Association for Computational Linguistics
- Pages: 132–139
- URL: https://aclanthology.org/2021.ecnlp-1.16
- DOI: 10.18653/v1/2021.ecnlp-1.16
- Cite (ACL): Justine Winkler, Simon Brugman, Bas van Berkel, and Martha Larson. 2021. Exploring Inspiration Sets in a Data Programming Pipeline for Product Moderation. In Proceedings of the 4th Workshop on e-Commerce and NLP, pages 132–139, Online. Association for Computational Linguistics.
- Cite (Informal): Exploring Inspiration Sets in a Data Programming Pipeline for Product Moderation (Winkler et al., ECNLP 2021)
- PDF: https://preview.aclanthology.org/nodalida-main-page/2021.ecnlp-1.16.pdf