Semi-Supervised Learning with Auxiliary Evaluation Component for Large Scale e-Commerce Text Classification
Mingkuan Liu, Musen Wen, Selcuk Kopru, Xianjing Liu, Alan Lu
Abstract
The lack of high-quality labeled training data has been one of the critical challenges facing many industrial machine learning tasks. To tackle this challenge, in this paper, we propose a semi-supervised learning method to utilize unlabeled data and user feedback signals to improve the performance of ML models. The method employs a primary model Main and an auxiliary evaluation model Eval, where Main and Eval models are trained iteratively by automatically generating labeled data from unlabeled data and/or users’ feedback signals. The proposed approach is applied to different text classification tasks. We report results on both the publicly available Yahoo! Answers dataset and our e-commerce product classification dataset. The experimental results show that the proposed method reduces the classification error rate by 4% and up to 15% across various experimental setups and datasets. A detailed comparison with other semi-supervised learning approaches is also presented later in the paper. The results from various text classification tasks demonstrate that our method outperforms those developed in previous related studies.
- Anthology ID: W18-3409
- Volume: Proceedings of the Workshop on Deep Learning Approaches for Low-Resource NLP
- Month: July
- Year: 2018
- Address: Melbourne
- Editors: Reza Haffari, Colin Cherry, George Foster, Shahram Khadivi, Bahar Salehi
- Venue: ACL
- Publisher: Association for Computational Linguistics
- Pages: 68–76
- URL: https://preview.aclanthology.org/jlcl-multiple-ingestion/W18-3409/
- DOI: 10.18653/v1/W18-3409
- Cite (ACL): Mingkuan Liu, Musen Wen, Selcuk Kopru, Xianjing Liu, and Alan Lu. 2018. Semi-Supervised Learning with Auxiliary Evaluation Component for Large Scale e-Commerce Text Classification. In Proceedings of the Workshop on Deep Learning Approaches for Low-Resource NLP, pages 68–76, Melbourne. Association for Computational Linguistics.
- Cite (Informal): Semi-Supervised Learning with Auxiliary Evaluation Component for Large Scale e-Commerce Text Classification (Liu et al., ACL 2018)
- PDF: https://preview.aclanthology.org/jlcl-multiple-ingestion/W18-3409.pdf
- Data: Yahoo! Answers