How Well Can a Genetic Algorithm Fine-tune Transformer Encoders? A First Approach

Vicente Ivan Sanchez Carmona, Shanshan Jiang, Bin Dong


Abstract
Genetic Algorithms (GAs) have been studied across different fields such as engineering and medicine to optimize diverse problems such as network routing or medical image segmentation. Moreover, they have been used to automatically find optimal architectures for deep neural networks. However, to our knowledge, they have not been applied as a weight optimizer for the Transformer model. While gradient descent has been the main paradigm for this task, we believe that GAs have advantages to bring to the table. In this paper, we show that even though GAs are capable of fine-tuning Transformer encoders, their generalization ability is considerably poorer than that of Adam; however, on closer inspection, the GAs' ability to exploit knowledge from two different pretraining datasets surpasses Adam's ability to do so.
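To make the idea of a GA acting as a weight optimizer concrete, the following is a minimal, illustrative sketch only: it evolves the weights of a linear classification head over synthetic features standing in for frozen encoder outputs (e.g., pooled [CLS] vectors). It is not the authors' actual setup; the population size, mutation scale, and selection scheme are arbitrary assumptions for illustration.

```python
# Illustrative GA sketch (hypothetical, not the paper's implementation):
# evolve a weight matrix for a classification head by selection,
# uniform crossover, and Gaussian mutation.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for frozen encoder outputs: 200 examples, 16 features, 2 classes.
X = rng.normal(size=(200, 16))
true_w = rng.normal(size=(16, 2))
y = (X @ true_w).argmax(axis=1)

def fitness(weights):
    """Classification accuracy of a candidate weight matrix (higher is better)."""
    logits = X @ weights
    return (logits.argmax(axis=1) == y).mean()

# GA hyperparameters (arbitrary choices for this sketch).
pop_size, n_gens, elite_k, mut_std = 50, 100, 5, 0.1
population = [rng.normal(size=(16, 2)) for _ in range(pop_size)]

for gen in range(n_gens):
    scored = sorted(population, key=fitness, reverse=True)
    elites = scored[:elite_k]                      # keep the best candidates
    children = []
    while len(children) < pop_size - elite_k:
        pa, pb = rng.choice(elite_k, size=2, replace=False)
        mask = rng.random(elites[pa].shape) < 0.5  # uniform crossover
        child = np.where(mask, elites[pa], elites[pb])
        child += rng.normal(scale=mut_std, size=child.shape)  # Gaussian mutation
        children.append(child)
    population = elites + children

best = max(population, key=fitness)
print(f"best accuracy: {fitness(best):.3f}")
```

In a real fine-tuning experiment the synthetic features would be replaced by Transformer encoder representations and the fitness by validation performance; how the paper encodes candidates and evaluates them is described in the PDF below.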
Anthology ID:
2024.insights-1.4
Volume:
Proceedings of the Fifth Workshop on Insights from Negative Results in NLP
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Shabnam Tafreshi, Arjun Akula, João Sedoc, Aleksandr Drozd, Anna Rogers, Anna Rumshisky
Venues:
insights | WS
Publisher:
Association for Computational Linguistics
Pages:
25–33
URL:
https://aclanthology.org/2024.insights-1.4
Cite (ACL):
Vicente Ivan Sanchez Carmona, Shanshan Jiang, and Bin Dong. 2024. How Well Can a Genetic Algorithm Fine-tune Transformer Encoders? A First Approach. In Proceedings of the Fifth Workshop on Insights from Negative Results in NLP, pages 25–33, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
How Well Can a Genetic Algorithm Fine-tune Transformer Encoders? A First Approach (Sanchez Carmona et al., insights-WS 2024)
PDF:
https://preview.aclanthology.org/jeptaln-2024-ingestion/2024.insights-1.4.pdf