Why Generate When You Can Discriminate? A Novel Technique for Text Classification using Language Models

Sachin Pawar; Nitin Ramrakhiyani; Anubhav Sinha; Manoj Apte; Girish Palshikar

Abstract

In this paper, we propose a novel two-step technique for text classification using autoregressive Language Models (LMs). In the first step, a set of perplexity- and log-likelihood-based numeric features is elicited from an LM for the text instance to be classified. In the second step, a classifier is trained on these features to predict the final label. The classifier is usually a simple machine learning model such as a Support Vector Machine (SVM) or Logistic Regression (LR), and it is trained using a small set of training examples. We believe our technique presents a whole new way of exploiting the available training instances, in addition to existing ways such as fine-tuning LMs or in-context learning. Our approach stands out because it eliminates the need for parameter updates in LMs, as required in fine-tuning, and it does not limit the number of training examples, a limit that is faced when building prompts for in-context learning. We evaluate our technique across 5 different datasets and compare it with multiple competitive baselines.
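
The two steps described in the abstract can be illustrated with a minimal, dependency-free sketch. Everything here is an assumption made for illustration rather than the authors' implementation: `ToyBigramLM` is a tiny add-one smoothed bigram model standing in for a pretrained autoregressive LM, the three features (mean token log-likelihood, minimum token log-probability, perplexity) are one plausible choice and not the paper's feature set, and `TinyLogReg` is a hypothetical helper standing in for a library LR or SVM classifier.

```python
import math
from collections import Counter


class ToyBigramLM:
    """Add-one smoothed bigram model; an assumed stand-in for a pretrained autoregressive LM."""

    def __init__(self, corpus):
        self.bigrams, self.context, vocab = Counter(), Counter(), set()
        for line in corpus:
            tokens = ["<s>"] + line.lower().split()
            vocab.update(tokens[1:])
            for prev, cur in zip(tokens, tokens[1:]):
                self.bigrams[(prev, cur)] += 1
                self.context[prev] += 1
        self.vocab_size = len(vocab) + 1  # one extra slot for unseen words

    def token_logprobs(self, text):
        """Log-probability of each token given the previous token."""
        tokens = ["<s>"] + text.lower().split()
        return [
            math.log((self.bigrams[(p, c)] + 1) / (self.context[p] + self.vocab_size))
            for p, c in zip(tokens, tokens[1:])
        ]


def lm_features(lm, text):
    """Step 1: numeric features derived from the LM's token log-likelihoods."""
    lps = lm.token_logprobs(text)
    mean_ll = sum(lps) / len(lps)
    return [mean_ll, min(lps), math.exp(-mean_ll)]  # last entry is perplexity


class TinyLogReg:
    """Step 2: logistic regression via batch gradient descent on standardized features."""

    def _scale(self, row):
        return [(v - m) / s for v, m, s in zip(row, self.mu, self.sd)]

    def _prob(self, z):
        score = sum(w * v for w, v in zip(self.w, z)) + self.b
        return 1.0 / (1.0 + math.exp(-score))

    def fit(self, X, y, lr=0.5, steps=500):
        n, d = len(X), len(X[0])
        self.mu = [sum(r[j] for r in X) / n for j in range(d)]
        self.sd = [
            math.sqrt(sum((r[j] - self.mu[j]) ** 2 for r in X) / n) or 1.0
            for j in range(d)
        ]
        Z = [self._scale(r) for r in X]
        self.w, self.b = [0.0] * d, 0.0
        for _ in range(steps):
            errs = [yi - self._prob(zi) for zi, yi in zip(Z, y)]
            self.w = [
                w + lr * sum(e * zi[j] for e, zi in zip(errs, Z)) / n
                for j, w in enumerate(self.w)
            ]
            self.b += lr * sum(errs) / n
        return self

    def predict(self, X):
        return [int(self._prob(self._scale(r)) >= 0.5) for r in X]


# The LM is built once and never updated while the classifier is trained.
lm = ToyBigramLM([
    "the team won the match",
    "the striker scored a goal",
    "the coach praised the team",
])

# A small labelled set: 1 = sports, 0 = other (hypothetical examples).
train_texts = [
    "the team scored a goal",
    "the coach won the match",
    "stocks fell after the report",
    "the bank raised interest rates",
]
y = [1, 1, 0, 0]

X = [lm_features(lm, t) for t in train_texts]
clf = TinyLogReg().fit(X, y)
print(clf.predict(X))
```

Only the small classifier is fitted to the labelled examples, and the LM is queried for scores but never modified, which mirrors the abstract's point that no LM parameter updates are needed. With a real pretrained LM, `token_logprobs` would be replaced by that model's own scoring routine.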

Anthology ID: 2024.findings-eacl.74
Volume: Findings of the Association for Computational Linguistics: EACL 2024
Month: March
Year: 2024
Address: St. Julian’s, Malta
Editors: Yvette Graham, Matthew Purver
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 1099–1114
URL: https://aclanthology.org/2024.findings-eacl.74
Cite (ACL): Sachin Pawar, Nitin Ramrakhiyani, Anubhav Sinha, Manoj Apte, and Girish Palshikar. 2024. Why Generate When You Can Discriminate? A Novel Technique for Text Classification using Language Models. In Findings of the Association for Computational Linguistics: EACL 2024, pages 1099–1114, St. Julian’s, Malta. Association for Computational Linguistics.
Cite (Informal): Why Generate When You Can Discriminate? A Novel Technique for Text Classification using Language Models (Pawar et al., Findings 2024)
PDF: https://preview.aclanthology.org/emnlp-22-attachments/2024.findings-eacl.74.pdf
