Troopers at BLP-2025 Task 2: Reward-Selective Fine-Tuning based Code Generation Approach for Bangla Prompts

Musa Tur Farazi, Nufayer Jahan Reza


Abstract
We present a formally grounded description of a reward-selective fine-tuning (RSFT) pipeline for code generation from Bangla natural-language prompts. The implemented system mines candidate programs via temperature and nucleus sampling, executes the candidates in a sandbox, retains those that pass all unit tests, performs supervised fine-tuning (SFT) on the winners using parameter-efficient Low-Rank Adaptation (LoRA) adapters, and improves robustness through fuzzed asserts. We specify the exact objectives and estimators used, provide a Bangla-aware preprocessing recipe, prove simple properties of the sampling budget, and report an ablation showing the effect of the inference sample budget K on accuracy. We also include a threat model for safe execution. Our code is available on GitHub.
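The winner-mining step the abstract describes (sample candidates, execute against unit tests, keep only programs that pass) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the candidate programs, `run_tests`, and `mine_winners` are hypothetical names, and a real system would run the candidates in a proper sandbox rather than a bare `exec`.

```python
def run_tests(program: str, tests: str) -> bool:
    """Execute a candidate program together with its unit tests.

    Passes iff no exception (including AssertionError) is raised.
    NOTE: exec() is for illustration only; the paper's pipeline runs
    candidates in an isolated sandbox for safety.
    """
    namespace = {}
    try:
        exec(program + "\n" + tests, namespace)
        return True
    except Exception:
        return False

def mine_winners(candidates, tests):
    """Reward-selective filtering: retain candidates passing all tests."""
    return [c for c in candidates if run_tests(c, tests)]

# Hypothetical sampled candidates for one prompt ("return the square of n"):
candidates = [
    "def square(n):\n    return n * n",   # correct
    "def square(n):\n    return n + n",   # wrong
    "def square(n):\n    return n ** 3",  # wrong
]
tests = "assert square(3) == 9\nassert square(0) == 0"

winners = mine_winners(candidates, tests)
print(len(winners))  # only the correct candidate survives the filter
```

The retained winners would then form the SFT dataset for the LoRA adapters; the fuzzed-assert augmentation mentioned in the abstract would add perturbed test inputs to `tests` to reject brittle candidates.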
Anthology ID:
2025.banglalp-1.54
Volume:
Proceedings of the Second Workshop on Bangla Language Processing (BLP-2025)
Month:
December
Year:
2025
Address:
Mumbai, India
Editors:
Firoj Alam, Sudipta Kar, Shammur Absar Chowdhury, Naeemul Hassan, Enamul Hoque Prince, Mohiuddin Tasnim, Md Rashad Al Hasan Rony, Md Tahmid Rahman
Venues:
BanglaLP | WS
Publisher:
Association for Computational Linguistics
Pages:
561–565
URL:
https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.banglalp-1.54/
Cite (ACL):
Musa Tur Farazi and Nufayer Jahan Reza. 2025. Troopers at BLP-2025 Task 2: Reward-Selective Fine-Tuning based Code Generation Approach for Bangla Prompts. In Proceedings of the Second Workshop on Bangla Language Processing (BLP-2025), pages 561–565, Mumbai, India. Association for Computational Linguistics.
Cite (Informal):
Troopers at BLP-2025 Task 2: Reward-Selective Fine-Tuning based Code Generation Approach for Bangla Prompts (Farazi & Reza, BanglaLP 2025)
PDF:
https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.banglalp-1.54.pdf