Abstract
Large language models (LLMs) are increasingly used for applications beyond text generation, ranging from text summarization to instruction following. One popular example of exploiting LLMs’ zero- and few-shot capabilities is the task of text classification. This short paper compares two popular LLM-based classification pipelines (GPT-4 and LLAMA 2) to a popular pre-LLM-era classification pipeline on the task of news trustworthiness classification, focusing on performance, training, and deployment requirements. We find that, in this case, the pre-LLM-era ensemble pipeline outperforms the two popular LLM pipelines while being orders of magnitude smaller in parameter size.
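The abstract contrasts zero- and few-shot LLM classification with a pre-LLM-era ensemble pipeline. As a generic illustration only (not the authors' actual ensemble, features, hyperparameters, or data), a minimal TF-IDF + XGBoost text classifier for a binary trustworthiness label could be sketched as follows; all texts, labels, and parameter values below are hypothetical.

```python
# Minimal sketch of a pre-LLM-era text classification pipeline
# (TF-IDF features + XGBoost). Illustration only: the paper's actual
# ensemble, features, hyperparameters, and dataset are not reproduced here.
from sklearn.feature_extraction.text import TfidfVectorizer
from xgboost import XGBClassifier

# Hypothetical toy data: article texts with binary trustworthiness labels.
texts = [
    "Officials confirmed the report after an independent review.",
    "SHOCKING: miracle cure that doctors don't want you to know about!",
    "The study was published in a peer-reviewed journal last week.",
    "Click here to learn the one secret banks are hiding from you.",
]
labels = [1, 0, 1, 0]  # 1 = trustworthy, 0 = untrustworthy (toy labels)

# Turn raw text into sparse word/bigram TF-IDF features.
vectorizer = TfidfVectorizer(ngram_range=(1, 2))
X = vectorizer.fit_transform(texts)

# Gradient-boosted tree classifier over the TF-IDF features.
clf = XGBClassifier(n_estimators=100, max_depth=4, eval_metric="logloss")
clf.fit(X, labels)

# Classify a new, unseen sentence.
print(clf.predict(vectorizer.transform(["Experts verified the claim, citing public records."])))
```

Such a pipeline trains in seconds on a CPU, which is the kind of training and deployment footprint the paper weighs against the much larger LLM-based pipelines.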
- Anthology ID: 2024.trustnlp-1.5
- Volume: Proceedings of the 4th Workshop on Trustworthy Natural Language Processing (TrustNLP 2024)
- Month: June
- Year: 2024
- Address: Mexico City, Mexico
- Editors: Anaelia Ovalle, Kai-Wei Chang, Yang Trista Cao, Ninareh Mehrabi, Jieyu Zhao, Aram Galstyan, Jwala Dhamala, Anoop Kumar, Rahul Gupta
- Venues: TrustNLP | WS
- Publisher: Association for Computational Linguistics
- Pages: 51–60
- URL: https://aclanthology.org/2024.trustnlp-1.5
- DOI: 10.18653/v1/2024.trustnlp-1.5
- Cite (ACL): Matyas Bohacek and Michal Bravansky. 2024. When XGBoost Outperforms GPT-4 on Text Classification: A Case Study. In Proceedings of the 4th Workshop on Trustworthy Natural Language Processing (TrustNLP 2024), pages 51–60, Mexico City, Mexico. Association for Computational Linguistics.
- Cite (Informal): When XGBoost Outperforms GPT-4 on Text Classification: A Case Study (Bohacek & Bravansky, TrustNLP-WS 2024)
- PDF: https://preview.aclanthology.org/nschneid-patch-4/2024.trustnlp-1.5.pdf