Abstract
Generative classifiers offer potential advantages over their discriminative counterparts, namely in the areas of data efficiency, robustness to data shift and adversarial examples, and zero-shot learning (Ng and Jordan,2002; Yogatama et al., 2017; Lewis and Fan,2019). In this paper, we improve generative text classifiers by introducing discrete latent variables into the generative story, and explore several graphical model configurations. We parameterize the distributions using standard neural architectures used in conditional language modeling and perform learning by directly maximizing the log marginal likelihood via gradient-based optimization, which avoids the need to do expectation-maximization. We empirically characterize the performance of our models on six text classification datasets. The choice of where to include the latent variable has a significant impact on performance, with the strongest results obtained when using the latent variable as an auxiliary conditioning variable in the generation of the textual input. This model consistently outperforms both the generative and discriminative classifiers in small-data settings. We analyze our model by finding that the latent variable captures interpretable properties of the data, even with very small training sets.- Anthology ID:
- D19-1048
- Volume:
- Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
- Month:
- November
- Year:
- 2019
- Address:
- Hong Kong, China
- Editors:
- Kentaro Inui, Jing Jiang, Vincent Ng, Xiaojun Wan
- Venues:
- EMNLP | IJCNLP
- SIG:
- SIGDAT
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 507–517
- Language:
- URL:
- https://aclanthology.org/D19-1048
- DOI:
- 10.18653/v1/D19-1048
- Cite (ACL):
- Xiaoan Ding and Kevin Gimpel. 2019. Latent-Variable Generative Models for Data-Efficient Text Classification. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 507–517, Hong Kong, China. Association for Computational Linguistics.
- Cite (Informal):
- Latent-Variable Generative Models for Data-Efficient Text Classification (Ding & Gimpel, EMNLP-IJCNLP 2019)
- PDF:
- https://preview.aclanthology.org/landing_page/D19-1048.pdf
- Data
- AG News, Yelp Review Polarity