pNLP-Mixer: an Efficient all-MLP Architecture for Language
Francesco Fusco, Damian Pascual, Peter Staar, Diego Antognini
Abstract
Large pre-trained language models based on transformer architectures have drastically changed the natural language processing (NLP) landscape. However, deploying these models for on-device applications on constrained devices such as smartwatches is impractical due to their size and inference cost. As an alternative to transformer-based architectures, recent work on efficient NLP has shown that weight-efficient models can attain competitive performance for simple tasks, such as slot filling and intent classification, with model sizes on the order of one megabyte. This work introduces the pNLP-Mixer architecture, an embedding-free MLP-Mixer model for on-device NLP that achieves high weight efficiency thanks to a novel projection layer. We evaluate a pNLP-Mixer model of only one megabyte in size on two multilingual semantic parsing datasets, MTOP and multiATIS. Our quantized model achieves 99.4% and 97.8% of the performance of mBERT on MTOP and multiATIS, respectively, while using 170x fewer parameters. Our model also consistently outperforms pQRNN, the state of the art among tiny models, which is twice as large, by a margin of up to 7.8% on MTOP.
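The abstract describes an embedding-free MLP-Mixer. As an orientation aid, below is a minimal PyTorch sketch (not the authors' released implementation) of a generic MLP-Mixer block of the kind such a model stacks: a token-mixing MLP applied across the sequence dimension followed by a channel-mixing MLP applied per token. The hashing-based projection layer that replaces the embedding table is only stubbed with random features here, and all names and hyperparameters are illustrative assumptions.

```python
# Minimal sketch of an MLP-Mixer block (Tolstikhin et al. style); hyperparameters
# and the input-feature stub are assumptions, not the pNLP-Mixer paper's values.
import torch
import torch.nn as nn


class MixerBlock(nn.Module):
    def __init__(self, seq_len: int, hidden_dim: int,
                 tokens_mlp_dim: int, channels_mlp_dim: int):
        super().__init__()
        self.norm1 = nn.LayerNorm(hidden_dim)
        # Token mixing: MLP over the sequence dimension (mixes information across tokens).
        self.token_mlp = nn.Sequential(
            nn.Linear(seq_len, tokens_mlp_dim),
            nn.GELU(),
            nn.Linear(tokens_mlp_dim, seq_len),
        )
        self.norm2 = nn.LayerNorm(hidden_dim)
        # Channel mixing: MLP over the feature dimension (mixes features within each token).
        self.channel_mlp = nn.Sequential(
            nn.Linear(hidden_dim, channels_mlp_dim),
            nn.GELU(),
            nn.Linear(channels_mlp_dim, hidden_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, hidden_dim)
        y = self.norm1(x).transpose(1, 2)          # (batch, hidden_dim, seq_len)
        x = x + self.token_mlp(y).transpose(1, 2)  # residual token mixing
        x = x + self.channel_mlp(self.norm2(x))    # residual channel mixing
        return x


# Usage: in an embedding-free setup, the input features would come from a
# hashing-based projection of the tokens; random tensors stand in for them here.
features = torch.randn(8, 64, 256)                # (batch, seq_len, hidden_dim)
block = MixerBlock(seq_len=64, hidden_dim=256,
                   tokens_mlp_dim=128, channels_mlp_dim=512)
out = block(features)                              # (8, 64, 256)
```

Because both mixing steps are plain linear layers over fixed-size dimensions, the parameter count stays small and independent of any vocabulary, which is what makes the embedding-free design attractive for megabyte-scale models.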
- Anthology ID: 2023.acl-industry.6
- Volume: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track)
- Month: July
- Year: 2023
- Address: Toronto, Canada
- Venue: ACL
- Publisher: Association for Computational Linguistics
- Pages: 53–60
- URL: https://aclanthology.org/2023.acl-industry.6
- Cite (ACL): Francesco Fusco, Damian Pascual, Peter Staar, and Diego Antognini. 2023. pNLP-Mixer: an Efficient all-MLP Architecture for Language. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track), pages 53–60, Toronto, Canada. Association for Computational Linguistics.
- Cite (Informal): pNLP-Mixer: an Efficient all-MLP Architecture for Language (Fusco et al., ACL 2023)
- PDF: https://preview.aclanthology.org/starsem-semeval-split/2023.acl-industry.6.pdf