Rethinking Full Finetuning from Pretraining Checkpoints in Active Learning for African Languages

Bonaventure F. P. Dossou, Ines Arous, Jackie CK Cheung


Abstract
Active learning (AL) aims to reduce annotation effort by iteratively selecting the most informative samples for labeling. The dominant strategy in AL involves fully finetuning the model on all acquired data after each round, which is computationally expensive in multilingual and low-resource settings. This paper investigates continual finetuning (CF), an alternative update strategy where the model is updated only on newly acquired samples at each round. We evaluate CF against full finetuning (FA) across 28 African languages using MasakhaNEWS and SIB-200. Our analysis reveals three key findings. First, CF matches or outperforms FA for languages included in the model’s pretraining, achieving up to 35% reductions in GPU memory, FLOPs, and training time. Second, CF performs comparably even for languages not seen during pretraining when they are typologically similar to those that were. Third, CF’s effectiveness depends critically on uncertainty-based acquisition; without it, performance deteriorates significantly. While FA remains preferable for some low-resource languages, the overall results establish CF as a robust, cost-efficient alternative for active learning in multilingual NLP. These findings motivate developing hybrid AL strategies that adapt fine-tuning behavior based on pretraining coverage, language typology, and acquisition dynamics.
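To make the contrast between the two update strategies concrete, here is a minimal Python sketch of the active-learning loop described in the abstract. It is not the authors' code: a toy sklearn SGD classifier stands in for the multilingual transformer, a synthetic dataset stands in for MasakhaNEWS/SIB-200, and predictive entropy stands in for the paper's uncertainty-based acquisition. FA restarts from the "checkpoint" and trains on all acquired data each round; CF updates the current model on the newly acquired batch only.

```python
# Hypothetical sketch of FA (full finetuning) vs. CF (continual finetuning)
# in a pool-based active-learning loop with entropy (uncertainty) acquisition.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, n_informative=10,
                           n_classes=3, random_state=0)
X_pool, X_test, y_pool, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

rng = np.random.default_rng(0)
seed_idx = list(rng.choice(len(X_pool), size=50, replace=False))  # initial labeled seed set

def new_model():
    # Stand-in for (re)loading the pretrained checkpoint.
    return SGDClassifier(loss="log_loss", random_state=0)

def entropy_acquire(model, unlabeled, k):
    # Uncertainty-based acquisition: pick the k unlabeled samples with highest predictive entropy.
    probs = model.predict_proba(X_pool[unlabeled])
    ent = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    return [unlabeled[i] for i in np.argsort(-ent)[:k]]

def run(strategy, rounds=5, k=50):
    labeled = list(seed_idx)
    unlabeled = [i for i in range(len(X_pool)) if i not in set(labeled)]
    model = new_model().fit(X_pool[labeled], y_pool[labeled])
    for _ in range(rounds):
        new_idx = entropy_acquire(model, unlabeled, k)
        unlabeled = [i for i in unlabeled if i not in set(new_idx)]
        labeled += new_idx
        if strategy == "FA":
            # Full finetuning: restart from the checkpoint, train on all acquired data.
            model = new_model().fit(X_pool[labeled], y_pool[labeled])
        else:
            # Continual finetuning: update the current model on the new batch only.
            model.partial_fit(X_pool[new_idx], y_pool[new_idx])
    return model.score(X_test, y_test)

print("FA test accuracy:", run("FA"))
print("CF test accuracy:", run("CF"))
```

The CF branch touches only the newly acquired samples each round, which is the source of the memory, FLOPs, and training-time savings the paper reports; swapping `entropy_acquire` for random sampling illustrates the third finding, that CF depends on uncertainty-based acquisition.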
Anthology ID:
2025.acl-srw.5
Volume:
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Jin Zhao, Mingyang Wang, Zhu Liu
Venues:
ACL | WS
Publisher:
Association for Computational Linguistics
Pages:
64–78
URL:
https://preview.aclanthology.org/landing_page/2025.acl-srw.5/
Cite (ACL):
Bonaventure F. P. Dossou, Ines Arous, and Jackie CK Cheung. 2025. Rethinking Full Finetuning from Pretraining Checkpoints in Active Learning for African Languages. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop), pages 64–78, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Rethinking Full Finetuning from Pretraining Checkpoints in Active Learning for African Languages (Dossou et al., ACL 2025)
PDF:
https://preview.aclanthology.org/landing_page/2025.acl-srw.5.pdf