When XGBoost Outperforms GPT-4 on Text Classification: A Case Study

Matyas Bohacek, Michal Bravansky


Abstract
Large language models (LLMs) are increasingly used for applications beyond text generation, ranging from text summarization to instruction following. One popular example that exploits LLMs’ zero- and few-shot capabilities is text classification. This short paper compares two popular LLM-based classification pipelines (GPT-4 and Llama 2) to a popular pre-LLM-era classification pipeline on the task of news trustworthiness classification, focusing on performance, training, and deployment requirements. We find that, in this case, the pre-LLM-era ensemble pipeline outperforms both LLM pipelines while being orders of magnitude smaller in parameter count.
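To make the comparison concrete, a pre-LLM-era classification pipeline of the kind the abstract refers to can be sketched as sparse lexical features feeding a gradient-boosted tree ensemble. The toy headlines, labels, and hyperparameters below are illustrative assumptions, not the authors' actual data or configuration (the paper uses XGBoost on a news trustworthiness corpus); scikit-learn's `GradientBoostingClassifier` stands in for XGBoost so the sketch needs no extra dependency.

```python
# A minimal sketch of a pre-LLM-era text-classification pipeline:
# TF-IDF features + a gradient-boosted tree ensemble.
# Toy data and hyperparameters are illustrative, not the paper's setup.
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline

texts = [
    "scientists publish peer-reviewed study on vaccine efficacy",
    "official statistics bureau releases quarterly employment report",
    "miracle cure doctors don't want you to know about",
    "shocking secret the government is hiding from you",
]
labels = [1, 1, 0, 0]  # 1 = trustworthy, 0 = untrustworthy (toy labels)

pipeline = Pipeline([
    # Unigram + bigram TF-IDF vectors as lexical features
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
    # Gradient-boosted trees as the ensemble classifier
    ("clf", GradientBoostingClassifier(n_estimators=50, random_state=0)),
])
pipeline.fit(texts, labels)

print(pipeline.predict(["new clinical trial results released by university"]))
```

Such a pipeline has on the order of thousands of learned parameters (tree splits plus the vocabulary), which illustrates the "orders of magnitude smaller" contrast with billion-parameter LLM classifiers.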
Anthology ID:
2024.trustnlp-1.5
Volume:
Proceedings of the 4th Workshop on Trustworthy Natural Language Processing (TrustNLP 2024)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Kai-Wei Chang, Anaelia Ovalle, Jieyu Zhao, Yang Trista Cao, Ninareh Mehrabi, Aram Galstyan, Jwala Dhamala, Anoop Kumar, Rahul Gupta
Venues:
TrustNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
51–60
URL:
https://aclanthology.org/2024.trustnlp-1.5
Cite (ACL):
Matyas Bohacek and Michal Bravansky. 2024. When XGBoost Outperforms GPT-4 on Text Classification: A Case Study. In Proceedings of the 4th Workshop on Trustworthy Natural Language Processing (TrustNLP 2024), pages 51–60, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
When XGBoost Outperforms GPT-4 on Text Classification: A Case Study (Bohacek & Bravansky, TrustNLP-WS 2024)
PDF:
https://preview.aclanthology.org/jeptaln-2024-ingestion/2024.trustnlp-1.5.pdf
Supplementary material:
2024.trustnlp-1.5.SupplementaryMaterial.zip