@inproceedings{roy-2025-codeanubad,
title = "{C}ode{A}nubad at {BLP}-2025 Task 2: Efficient {B}angla-to-Python Code Generation via Iterative {L}o{RA} Fine-Tuning of Gemma-2",
author = "Roy, Soumyajit",
editor = "Alam, Firoj and
Kar, Sudipta and
Chowdhury, Shammur Absar and
Hassan, Naeemul and
Prince, Enamul Hoque and
Tasnim, Mohiuddin and
Rony, Md Rashad Al Hasan and
Rahman, Md Tahmid Rahman",
booktitle = "Proceedings of the Second Workshop on Bangla Language Processing (BLP-2025)",
month = dec,
year = "2025",
address = "Mumbai, India",
publisher = "Association for Computational Linguistics",
url = "https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.banglalp-1.53/",
pages = "556--560",
ISBN = "979-8-89176-314-2",
abstract = "This paper presents our submission for Task 2 of the Bangla Language Processing (BLP) Workshop, which focuses on generating Python code from Bangla programming prompts in a low-resource setting. We address this challenge by fine-tuning the gemma-2-9b instruction-tuned model using parameter-efficient fine-tuning (PEFT) with QLoRA. We propose an iterative self-improvement strategy that augments the extremely limited training data (74 examples) by reusing verified correct predictions from the development set, alongside LoRA rank experiments (8, 16, 32), observing a clear correlation between rank and accuracy, with rank 32 delivering the best results. Compared to translation-based and retrieval-augmented baselines, our approach achieves significantly higher accuracy, with a pass rate of 47{\%} on the development set and 37{\%} on the hidden test set. These results highlight the effectiveness of combining iterative data augmentation with rank optimisation for specialised, low-resource code generation tasks."
}