Abstract
Since the introduction of transformer-based language models in 2018, the current generation of natural language processing (NLP) models continues to demonstrate impressive capabilities on a variety of academic benchmarks and real-world applications. This progress rests on a simple but general pipeline: pre-training neural language models on large quantities of text, followed by an adaptation step that fine-tunes the pre-trained model to perform a specific NLP task of interest. However, despite the impressive progress on academic benchmarks and the widespread deployment of pre-trained and fine-tuned language models in industry, we still lack a fundamental understanding of how and why pre-trained and fine-tuned language models work, as well as of the individual steps of the pipeline that produce them. We make several contributions towards improving our understanding of pre-trained and fine-tuned language models, ranging from analyzing the linguistic knowledge of pre-trained language models and how it is affected by fine-tuning, to a rigorous analysis of the fine-tuning process itself and how the choice of adaptation technique affects the generalization of models. In doing so, we provide new insights about previously unexplained phenomena and the capabilities of pre-trained and fine-tuned language models.
- Anthology ID: 2023.bigpicture-1.10
- Volume: Proceedings of the Big Picture Workshop
- Month: December
- Year: 2023
- Address: Singapore
- Editors: Yanai Elazar, Allyson Ettinger, Nora Kassner, Sebastian Ruder, Noah A. Smith
- Venue: BigPicture
- Publisher: Association for Computational Linguistics
- Pages: 123–134
- URL: https://aclanthology.org/2023.bigpicture-1.10
- DOI: 10.18653/v1/2023.bigpicture-1.10
- Cite (ACL): Marius Mosbach. 2023. Analyzing Pre-trained and Fine-tuned Language Models. In Proceedings of the Big Picture Workshop, pages 123–134, Singapore. Association for Computational Linguistics.
- Cite (Informal): Analyzing Pre-trained and Fine-tuned Language Models (Mosbach, BigPicture 2023)
- PDF: https://preview.aclanthology.org/dois-2013-emnlp/2023.bigpicture-1.10.pdf