Abstract
Since the introduction of transformer-based language models in 2018, the current generation of natural language processing (NLP) models continues to demonstrate impressive capabilities on a variety of academic benchmarks and real-world applications. This progress rests on a simple but general pipeline: pre-training neural language models on large quantities of text, followed by an adaptation step that fine-tunes the pre-trained model to perform a specific NLP task of interest. However, despite the impressive progress on academic benchmarks and the widespread deployment of pre-trained and fine-tuned language models in industry, we still lack a fundamental understanding of how and why pre-trained and fine-tuned language models work, as well as of the individual steps of the pipeline that produce them. We make several contributions towards improving our understanding of pre-trained and fine-tuned language models, ranging from analyzing the linguistic knowledge of pre-trained language models and how it is affected by fine-tuning, to a rigorous analysis of the fine-tuning process itself and how the choice of adaptation technique affects the generalization of models. In doing so, we provide new insights about previously unexplained phenomena and the capabilities of pre-trained and fine-tuned language models.
- Anthology ID: 2023.bigpicture-1.10
- Volume: Proceedings of the Big Picture Workshop
- Month: December
- Year: 2023
- Address: Singapore
- Editors: Yanai Elazar, Allyson Ettinger, Nora Kassner, Sebastian Ruder, Noah A. Smith
- Venue: BigPicture
- Publisher: Association for Computational Linguistics
- Pages: 123–134
- URL: https://aclanthology.org/2023.bigpicture-1.10
- DOI: 10.18653/v1/2023.bigpicture-1.10
- Cite (ACL): Marius Mosbach. 2023. Analyzing Pre-trained and Fine-tuned Language Models. In Proceedings of the Big Picture Workshop, pages 123–134, Singapore. Association for Computational Linguistics.
- Cite (Informal): Analyzing Pre-trained and Fine-tuned Language Models (Mosbach, BigPicture 2023)
- PDF: https://preview.aclanthology.org/dois-2013-emnlp/2023.bigpicture-1.10.pdf