@inproceedings{li-etal-2023-mixture,
    title = "Mixture-of-Linguistic-Experts Adapters for Improving and Interpreting Pre-trained Language Models",
    author = "Li, Raymond  and
      Murray, Gabriel  and
      Carenini, Giuseppe",
    editor = "Bouamor, Houda  and
      Pino, Juan  and
      Bali, Kalika",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2023",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2023.findings-emnlp.634/",
    doi = "10.18653/v1/2023.findings-emnlp.634",
    pages = "9456--9469",
    abstract = "In this work, we propose a method that combines two popular research areas by injecting linguistic structures into pre-trained language models in the parameter-efficient fine-tuning (PEFT) setting. In our approach, parallel adapter modules encoding different linguistic structures are combined using a novel Mixture-of-Linguistic-Experts architecture, where Gumbel-Softmax gates determine the importance of these modules at each layer of the model. To reduce the number of parameters, we first train the model for a fixed small number of steps before pruning the experts based on their importance scores. Our experimental results with three different pre-trained models show that our approach can outperform state-of-the-art PEFT methods with a comparable number of parameters. In addition, we analyze the experts selected by each model at each layer to provide insights for future studies."
}
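
As a rough illustration of the mechanism summarized in the abstract, the sketch below mixes parallel adapter "experts" with a per-layer Gumbel-Softmax gate. This is a hypothetical PyTorch-style sketch, not the authors' implementation: the class names, bottleneck size, temperature, and the use of softmax gate weights as importance scores at evaluation time are all assumptions.

```python
# Hypothetical sketch of Gumbel-Softmax-gated parallel adapters (not the authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class Adapter(nn.Module):
    """A standard bottleneck adapter: down-project, nonlinearity, up-project, residual."""
    def __init__(self, hidden_size: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(F.relu(self.down(x)))

class MixtureOfAdapters(nn.Module):
    """Combine parallel adapter experts with a learned Gumbel-Softmax gate per layer."""
    def __init__(self, hidden_size: int, num_experts: int = 3, tau: float = 1.0):
        super().__init__()
        self.experts = nn.ModuleList(Adapter(hidden_size) for _ in range(num_experts))
        self.gate_logits = nn.Parameter(torch.zeros(num_experts))  # learned gate scores
        self.tau = tau  # Gumbel-Softmax temperature (assumed value)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # During training, sample soft expert weights with the Gumbel-Softmax trick;
        # at evaluation, the softmax of the logits can serve as importance scores.
        if self.training:
            weights = F.gumbel_softmax(self.gate_logits, tau=self.tau, hard=False)
        else:
            weights = F.softmax(self.gate_logits, dim=-1)
        outputs = torch.stack([expert(x) for expert in self.experts], dim=0)
        return torch.einsum("e,e...->...", weights, outputs)

# Usage: gate the hidden states of one transformer layer, shape (batch, seq_len, hidden).
layer = MixtureOfAdapters(hidden_size=768, num_experts=3)
h = torch.randn(2, 16, 768)
out = layer(h)  # same shape as h
```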