Small Language Models Improve Giants by Rewriting Their Outputs
Giorgos Vernikos, Arthur Brazinskas, Jakub Adamek, Jonathan Mallinson, Aliaksei Severyn, Eric Malmi
Abstract
Despite the impressive performance of large language models (LLMs), they often lag behind specialized models in various tasks. LLMs only use a fraction of the existing training data for in-context learning, while task-specific models harness the full dataset for fine-tuning. In this work, we tackle the problem of leveraging training data to improve the performance of LLMs without fine-tuning. Our approach directly targets LLM predictions without requiring access to their weights. We create a pool of candidates from the LLM through few-shot prompting and we employ a compact model, the LM-corrector (LMCor), specifically trained to merge these candidates to produce an enhanced output. Our experiments on four natural language generation tasks demonstrate that even a small LMCor model (250M) substantially improves the few-shot performance of LLMs (62B), matching and even outperforming standard fine-tuning. Furthermore, we illustrate the robustness of LMCor against different prompts, thereby minimizing the need for extensive prompt engineering. Finally, we show that LMCor can be seamlessly integrated with different LLMs at inference, serving as a plug-and-play module to improve their performance.
- Anthology ID:
- 2024.eacl-long.165
- Volume:
- Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- March
- Year:
- 2024
- Address:
- St. Julian’s, Malta
- Editors:
- Yvette Graham, Matthew Purver
- Venue:
- EACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 2703–2718
- URL:
- https://aclanthology.org/2024.eacl-long.165
- Cite (ACL):
- Giorgos Vernikos, Arthur Brazinskas, Jakub Adamek, Jonathan Mallinson, Aliaksei Severyn, and Eric Malmi. 2024. Small Language Models Improve Giants by Rewriting Their Outputs. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2703–2718, St. Julian’s, Malta. Association for Computational Linguistics.
- Cite (Informal):
- Small Language Models Improve Giants by Rewriting Their Outputs (Vernikos et al., EACL 2024)
- PDF:
- https://aclanthology.org/2024.eacl-long.165.pdf
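
The abstract describes a two-stage pipeline: sample a pool of candidate outputs from a frozen LLM via few-shot prompting, then have a small trained corrector merge and rewrite them into one improved output. Below is a minimal sketch of that inference-time plumbing, not the authors' released code: the model names, the input template, and all generation hyperparameters are illustrative assumptions, and in practice the corrector would first be fine-tuned on (source, candidates, reference) triples as the paper describes.

```python
# Hypothetical sketch of the candidate-pool + corrector pipeline from the
# abstract. Both checkpoints are stand-ins chosen for illustration only.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

LLM_NAME = "google/flan-t5-large"        # stand-in for the frozen few-shot LLM
CORRECTOR_NAME = "google/flan-t5-small"  # stand-in for the ~250M LMCor

llm_tok = AutoTokenizer.from_pretrained(LLM_NAME)
llm = AutoModelForSeq2SeqLM.from_pretrained(LLM_NAME)
cor_tok = AutoTokenizer.from_pretrained(CORRECTOR_NAME)
corrector = AutoModelForSeq2SeqLM.from_pretrained(CORRECTOR_NAME)

def generate_candidates(prompt: str, n: int = 3) -> list[str]:
    """Sample a pool of n candidate outputs from the frozen LLM."""
    ids = llm_tok(prompt, return_tensors="pt").input_ids
    outs = llm.generate(
        ids, do_sample=True, top_p=0.9,
        num_return_sequences=n, max_new_tokens=64,
    )
    return llm_tok.batch_decode(outs, skip_special_tokens=True)

def correct(source: str, candidates: list[str]) -> str:
    """Concatenate the source with all candidates (assumed template) and
    let the corrector merge/rewrite them into a single improved output."""
    joined = " ".join(f"candidate: {c}" for c in candidates)
    ids = cor_tok(
        f"source: {source} {joined}",
        return_tensors="pt", truncation=True,
    ).input_ids
    out = corrector.generate(ids, max_new_tokens=64)
    return cor_tok.decode(out[0], skip_special_tokens=True)

prompt = "Translate English to German: The cat sat on the mat."
pool = generate_candidates(prompt)
print(correct(prompt, pool))
```

Because the corrector only reads the LLM's text outputs, it needs no access to the LLM's weights, which is what makes the plug-and-play claim in the abstract possible: the same trained corrector can be placed behind different LLMs at inference.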