Akm Mahbubur Rahman

Also published as: AKM Mahbubur Rahman

2026

PRiSM: Partial Ranking via Inter-layer Semantic Measurement for Efficient Fine-tuning of Language Models
Aldrin Kabya Biswas | Md Fahim | Md. Ashraful Amin | Amin Ahsan Ali | AKM Mahbubur Rahman
Proceedings of the Fifteenth Language Resources and Evaluation Conference

The growing scale of pre-trained language models poses a challenge in fine-tuning for downstream tasks, especially in resource-constrained settings. Recent studies highlight that not all layers in transformer-based language models contribute equally to downstream task performance, giving rise to various partial fine-tuning strategies. However, current methods often introduce significant training overhead or rely on simple heuristics that yield suboptimal performance and poor generalization. We propose PRiSM (Partial Ranking via inter-layer Semantic Measurement), a training-free approach for layer-wise partial fine-tuning that leverages the cosine similarity between pre-trained aggregate token representations across layers to identify inter-layer relationships. PRiSM comprises two stages: (i) scoring layers based on their relevance to the task via a single forward pass, and (ii) fine-tuning a subset of block-wise highest-scoring layers, while keeping others frozen. We conduct experiments on 15 diverse NLP datasets, including single-sentence and sentence-pair classification tasks. Our method achieves competitive performance compared to full fine-tuning, with an average training speedup of 1.5× and a reduction of trainable parameters by 75%, and outperforms all the comparative baselines. Additionally, our approach does not cause any notable drop in performance when the domain is changed for the evaluation tasks, demonstrating robust cross-domain generalizability.
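
The two stages described in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the abstract does not give the scoring formula, so the rule used here (one minus the cosine similarity between a layer's aggregate representation and the previous layer's) is an assumption, as is reading "block-wise highest-scoring" as taking the top-scoring layer in each contiguous block. The function names and toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def score_layers(layer_vectors):
    """Score each layer from aggregate representations of one forward pass.

    layer_vectors[0] is the embedding output; layer_vectors[i + 1] is the
    output of layer i. ASSUMPTION: a layer scores higher when it changes
    the aggregate representation more, i.e. score = 1 - cosine similarity
    with the previous layer's output.
    """
    return [
        1.0 - cosine(layer_vectors[i], layer_vectors[i + 1])
        for i in range(len(layer_vectors) - 1)
    ]


def select_per_block(scores, block_size):
    """Pick the highest-scoring layer index inside each contiguous block.

    ASSUMPTION: one layer is kept trainable per block; all other layers
    would stay frozen during fine-tuning.
    """
    selected = []
    for start in range(0, len(scores), block_size):
        block = range(start, min(start + block_size, len(scores)))
        selected.append(max(block, key=lambda i: scores[i]))
    return selected


# Toy aggregate (e.g. mean-pooled) representations: embeddings + 4 layers.
layer_vectors = [
    [1.0, 0.0],  # embedding output
    [1.0, 0.0],  # layer 0: unchanged direction
    [0.0, 1.0],  # layer 1: orthogonal to its input
    [0.0, 1.0],  # layer 2: unchanged direction
    [1.0, 1.0],  # layer 3: partially rotated
]

scores = score_layers(layer_vectors)
trainable_layers = select_per_block(scores, block_size=2)
print(trainable_layers)
```

In a real model the selected indices would determine which transformer layers keep gradients enabled; how the vectors are pooled and how many layers are kept per block are choices the abstract leaves open.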

2025

BD at BEA 2025 Shared Task: MPNet Ensembles for Pedagogical Mistake Identification and Localization in AI Tutor Responses
Shadman Rohan | Ishita Sur Apan | Muhtasim Ibteda Shochcho | Md Fahim | Mohammad Ashfaq Ur Rahman | AKM Mahbubur Rahman | Amin Ahsan Ali
Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025)

We present Team BD’s submission to the BEA 2025 Shared Task on Pedagogical Ability Assessment of AI-powered Tutors, under Track 1 (Mistake Identification) and Track 2 (Mistake Location). Both tracks involve three-class classification of tutor responses in educational dialogues – determining if a tutor correctly recognizes a student’s mistake (Track 1) and whether the tutor pinpoints the mistake’s location (Track 2). Our system is built on MPNet, a Transformer-based language model that combines BERT and XLNet’s pre-training advantages. We fine-tuned MPNet on the task data using a class-weighted cross-entropy loss to handle class imbalance, and leveraged grouped cross-validation (10 folds) to maximize the use of limited data while avoiding dialogue overlap between training and validation. We then performed a hard-voting ensemble of the best models from each fold, which improves robustness and generalization by combining multiple classifiers. Our approach achieved strong results on both tracks, with exact-match macro-F1 scores of approximately 0.7110 for Mistake Identification and 0.5543 for Mistake Location on the official test set. We include a comprehensive analysis of our system’s performance, including confusion matrices and t-SNE visualizations to interpret classifier behavior, as well as a taxonomy of common errors with examples. We hope our ensemble-based approach and findings provide useful insights for designing reliable tutor response evaluation systems in educational dialogue settings.
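
Two of the ingredients named in the abstract, grouped cross-validation and hard voting, can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the team's code: the round-robin fold assignment, the dialogue IDs, the label strings, and the function names are all hypothetical, and the example uses 2 folds and 3 models only to keep the toy data small.

```python
from collections import Counter


def grouped_folds(dialogue_ids, n_folds):
    """Assign each example to a fold so one dialogue never spans two folds.

    ASSUMPTION: dialogues are distributed round-robin in sorted order;
    the abstract does not say how groups were allocated to folds.
    """
    unique_groups = sorted(set(dialogue_ids))
    group_to_fold = {g: i % n_folds for i, g in enumerate(unique_groups)}
    return [group_to_fold[g] for g in dialogue_ids]


def hard_vote(predictions_per_model):
    """Return the majority label per example across all models.

    Ties resolve to the label seen first among the tied ones, which is
    how Counter.most_common orders equal counts (an arbitrary choice here).
    """
    voted = []
    for votes in zip(*predictions_per_model):
        voted.append(Counter(votes).most_common(1)[0][0])
    return voted


# Hypothetical dialogue IDs for six tutor responses.
dialogue_ids = ["d1", "d1", "d2", "d3", "d3", "d4"]
folds = grouped_folds(dialogue_ids, n_folds=2)

# Hypothetical predictions from three fold models on three test examples.
fold_model_predictions = [
    ["label_a", "label_b", "label_c"],
    ["label_a", "label_a", "label_c"],
    ["label_b", "label_a", "label_b"],
]
ensemble = hard_vote(fold_model_predictions)

print(folds)
print(ensemble)
```

Keeping every response from a dialogue in the same fold is what prevents the overlap between training and validation that the abstract mentions; the vote then combines one model per fold into a single prediction for each test example.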