Abstract
In weakly-supervised text classification, only label names act as sources of supervision. Predominant approaches to weakly-supervised text classification utilize a two-phase framework, where test samples are first assigned pseudo-labels and are then used to train a neural text classifier. In most previous work, the pseudo-labeling step is dependent on obtaining seed words that best capture the relevance of each class label. We present LIME, a framework for weakly-supervised text classification that entirely replaces the brittle seed-word generation process with entailment-based pseudo-classification. We find that combining weakly-supervised classification and textual entailment mitigates shortcomings of both, resulting in a more streamlined and effective classification pipeline. With just an off-the-shelf textual entailment model, LIME outperforms recent baselines in weakly-supervised text classification and achieves state-of-the-art in 4 benchmarks.- Anthology ID:
- 2022.coling-1.91
- Volume:
- Proceedings of the 29th International Conference on Computational Linguistics
- Month:
- October
- Year:
- 2022
- Address:
- Gyeongju, Republic of Korea
- Venue:
- COLING
- SIG:
- Publisher:
- International Committee on Computational Linguistics
- Note:
- Pages:
- 1083–1088
- Language:
- URL:
- https://aclanthology.org/2022.coling-1.91
- DOI:
- Cite (ACL):
- Seongmin Park and Jihwa Lee. 2022. LIME: Weakly-Supervised Text Classification without Seeds. In Proceedings of the 29th International Conference on Computational Linguistics, pages 1083–1088, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
- Cite (Informal):
- LIME: Weakly-Supervised Text Classification without Seeds (Park & Lee, COLING 2022)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/2022.coling-1.91.pdf
- Code
- seongminp/lime
- Data
- AG News