@inproceedings{li-etal-2023-mixture,
    title = "Mixture-of-Linguistic-Experts Adapters for Improving and Interpreting Pre-trained Language Models",
    author = "Li, Raymond  and
      Murray, Gabriel  and
      Carenini, Giuseppe",
    editor = "Bouamor, Houda  and
      Pino, Juan  and
      Bali, Kalika",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2023",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2023.findings-emnlp.634/",
    doi = "10.18653/v1/2023.findings-emnlp.634",
    pages = "9456--9469",
    abstract = "In this work, we propose a method that combines two popular research areas by injecting linguistic structures into pre-trained language models in the parameter-efficient fine-tuning (PEFT) setting. In our approach, parallel adapter modules encoding different linguistic structures are combined using a novel Mixture-of-Linguistic-Experts architecture, where Gumbel-Softmax gates determine the importance of these modules at each layer of the model. To reduce the number of parameters, we first train the model for a fixed small number of steps before pruning the experts based on their importance scores. Our experimental results with three different pre-trained models show that our approach can outperform state-of-the-art PEFT methods with a comparable number of parameters. In addition, we analyze the experts selected by each model at each layer to provide insights for future studies."
}
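
As a rough illustration of the mechanism summarized in the abstract, the sketch below mixes parallel adapter "experts" with a per-layer Gumbel-Softmax gate. This is a hypothetical PyTorch-style sketch, not the authors' implementation: the class names, bottleneck size, temperature, and the use of softmax gate weights as importance scores at evaluation time are all assumptions.

```python
# Hypothetical sketch of Gumbel-Softmax-gated parallel adapters (not the authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class Adapter(nn.Module):
    """A standard bottleneck adapter: down-project, nonlinearity, up-project, residual."""
    def __init__(self, hidden_size: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(F.relu(self.down(x)))

class MixtureOfAdapters(nn.Module):
    """Combine parallel adapter experts with a learned Gumbel-Softmax gate per layer."""
    def __init__(self, hidden_size: int, num_experts: int = 3, tau: float = 1.0):
        super().__init__()
        self.experts = nn.ModuleList(Adapter(hidden_size) for _ in range(num_experts))
        self.gate_logits = nn.Parameter(torch.zeros(num_experts))  # learned gate scores
        self.tau = tau  # Gumbel-Softmax temperature (assumed value)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # During training, sample soft expert weights with the Gumbel-Softmax trick;
        # at evaluation, the softmax of the logits can serve as importance scores.
        if self.training:
            weights = F.gumbel_softmax(self.gate_logits, tau=self.tau, hard=False)
        else:
            weights = F.softmax(self.gate_logits, dim=-1)
        outputs = torch.stack([expert(x) for expert in self.experts], dim=0)
        return torch.einsum("e,e...->...", weights, outputs)

# Usage: gate the hidden states of one transformer layer, shape (batch, seq_len, hidden).
layer = MixtureOfAdapters(hidden_size=768, num_experts=3)
h = torch.randn(2, 16, 768)
out = layer(h)  # same shape as h
```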