Abstract
Fine-tuning is a widely used technique for leveraging pre-trained language models (PLMs) in downstream tasks, but it can be computationally expensive and storage-intensive. To address this challenge, researchers have developed parameter-efficient methods that balance performance and resource cost. However, these methods often come with trade-offs such as increased inference latency, consumed token length, or limited adaptability in multitask scenarios. This paper introduces a novel parameter-efficient method called DimA (Dimensionality Augmentation), which enhances the Transformer architecture by increasing its dimensionality. DimA achieves state-of-the-art results on the GLUE and XSUM tasks while using less than 1% of the original model's parameters. Moreover, DimA introduces a novel approach to knowledge transfer that enables knowledge learned from multiple tasks to be used simultaneously when handling new tasks, significantly improving performance on those tasks. Its structural flexibility also allows it to be applied to various Transformer-based models.
- Anthology ID:
- 2024.lrec-main.441
- Volume:
- Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
- Month:
- May
- Year:
- 2024
- Address:
- Torino, Italia
- Editors:
- Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
- Venues:
- LREC | COLING
- Publisher:
- ELRA and ICCL
- Pages:
- 4922–4934
- URL:
- https://aclanthology.org/2024.lrec-main.441
- Cite (ACL):
- Wenxuan Zhang, Min Huang, Zhuoyang Song, and Qinghai Miao. 2024. DimA: A Parameter-efficient Fine-tuning Method with Knowledge Transfer Based on Transformer. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 4922–4934, Torino, Italia. ELRA and ICCL.
- Cite (Informal):
- DimA: A Parameter-efficient Fine-tuning Method with Knowledge Transfer Based on Transformer (Zhang et al., LREC-COLING 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-3/2024.lrec-main.441.pdf
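The abstract describes adding a small number of trainable parameters alongside a frozen Transformer while keeping the trainable fraction tiny. As a rough illustration of that general parameter-efficient idea (this is a generic adapter-style sketch, not the paper's actual DimA implementation; all names and sizes here are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: d_model is the frozen hidden width, d_aug is the
# small number of added trainable dimensions.
d_model, d_aug = 768, 8

# Frozen pre-trained weight matrix (stands in for one Transformer projection).
W_frozen = rng.standard_normal((d_model, d_model)) * 0.02

# Small trainable matrices routing activations through d_aug extra dimensions.
A = np.zeros((d_model, d_aug))                     # down-projection (init zero)
B = rng.standard_normal((d_aug, d_model)) * 0.02   # up-projection back

def forward(x):
    """Frozen path plus the small trainable augmented path."""
    return x @ W_frozen + (x @ A) @ B

x = rng.standard_normal((4, d_model))
# Because A starts at zero, the adapted model initially matches the frozen one.
assert np.allclose(forward(x), x @ W_frozen)

trainable, total = A.size + B.size, W_frozen.size
print(f"trainable fraction of this layer: {trainable / total:.2%}")
```

Only `A` and `B` would be updated during fine-tuning, so the trainable parameter count stays a small fraction of the frozen weight's, which is the resource trade-off the abstract refers to.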