Recent Advances in Pre-trained Language Models: Why Do They Work and How Do They Work

Cheng-Han Chiang, Yung-Sung Chuang, Hung-yi Lee


Abstract
Pre-trained language models (PLMs) are language models that are pre-trained on large-scale corpora in a self-supervised fashion. These PLMs have fundamentally changed the natural language processing community in the past few years. In this tutorial, we aim to provide a broad and comprehensive introduction from two perspectives: why those PLMs work, and how to use them in NLP tasks. The first part of the tutorial presents analyses of PLMs that partially explain their exceptional downstream performance. The second part first focuses on emerging pre-training methods that enable PLMs to perform diverse downstream tasks and then illustrates how one can apply those PLMs to downstream tasks under different circumstances. These circumstances include fine-tuning PLMs under data scarcity and using PLMs with parameter efficiency. We believe that attendees of different backgrounds will find this tutorial informative and useful.
Anthology ID:
2022.aacl-tutorials.2
Volume:
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing: Tutorial Abstracts
Month:
November
Year:
2022
Address:
Taipei
Editors:
Miguel A. Alonso, Zhongyu Wei
Venues:
AACL | IJCNLP
Publisher:
Association for Computational Linguistics
Pages:
8–15
URL:
https://aclanthology.org/2022.aacl-tutorials.2
Cite (ACL):
Cheng-Han Chiang, Yung-Sung Chuang, and Hung-yi Lee. 2022. Recent Advances in Pre-trained Language Models: Why Do They Work and How Do They Work. In Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing: Tutorial Abstracts, pages 8–15, Taipei. Association for Computational Linguistics.
Cite (Informal):
Recent Advances in Pre-trained Language Models: Why Do They Work and How Do They Work (Chiang et al., AACL-IJCNLP 2022)
PDF:
https://preview.aclanthology.org/ingest-bitext-workshop/2022.aacl-tutorials.2.pdf