Abstract
Many text classification tasks must handle unseen domains for which plenty of unlabeled data is available, giving rise to the self-adaption, or so-called transductive zero-shot learning (TZSL), problem. However, current methods based solely on encoders or solely on decoders overlook the possibility that these two modules may promote each other. As a first effort to bridge this gap, we propose an autoencoder named ZeroAE. Specifically, the text is encoded by two separate BERT-based encoders into two disentangled latent spaces: a label-relevant space (used for classification) and a label-irrelevant space. The two latent representations are then decoded by prompting GPT-2 to recover the text, and also to generate labeled text in the unseen domains that is used in turn to train the encoder. To better exploit the unlabeled data, a novel indirect uncertainty-aware sampling (IUAS) approach is proposed to train ZeroAE. Extensive experiments show that ZeroAE surpasses the SOTA methods by 15.93% and 8.70% on average in the label-partially-unseen and label-fully-unseen scenarios, respectively. Notably, ZeroAE in the label-fully-unseen scenario even outperforms the SOTA methods in the label-partially-unseen scenario.

- Anthology ID:
- 2023.findings-acl.200
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2023
- Month:
- July
- Year:
- 2023
- Address:
- Toronto, Canada
- Editors:
- Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 3202–3219
- URL:
- https://aclanthology.org/2023.findings-acl.200
- DOI:
- 10.18653/v1/2023.findings-acl.200
- Cite (ACL):
- Kaihao Guo, Hang Yu, Cong Liao, Jianguo Li, and Haipeng Zhang. 2023. ZeroAE: Pre-trained Language Model based Autoencoder for Transductive Zero-shot Text Classification. In Findings of the Association for Computational Linguistics: ACL 2023, pages 3202–3219, Toronto, Canada. Association for Computational Linguistics.
- Cite (Informal):
- ZeroAE: Pre-trained Language Model based Autoencoder for Transductive Zero-shot Text Classification (Guo et al., Findings 2023)
- PDF:
- https://preview.aclanthology.org/naacl24-info/2023.findings-acl.200.pdf