S M Rafiuddin
Also published as: Rifat Rafiuddin
2026
Context-Conditioned Masked LoRA: Dynamic Rank Routing for Compute-Efficient Parameter-Efficient Fine-Tuning
Rifat Rafiuddin | Rafae Abdullah
Findings of the Association for Computational Linguistics: ACL 2026
Parameter-efficient fine-tuning methods such as LoRA reduce trainable parameters, but still apply dense low-rank updates per token, leaving adaptation compute largely fixed once rank is set. We propose Context-Conditioned Masked LoRA (CCM-LoRA), which learns a lightweight router that activates an input-dependent subset of LoRA rank directions, turning LoRA into dynamic rank routing and enabling contextual sparsity in fine-tuning and inference. CCM-LoRA is trained with a budget-constrained objective that targets an expected effective rank (or FLOPs) while regularizing routing to avoid degenerate always-on/off masks. Across public NLU and multilingual benchmarks, CCM-LoRA improves the accuracy–efficiency Pareto frontier versus static-rank LoRA and adaptive-rank baselines, matching or improving task performance at lower inference-time effective rank. We also provide a reproducible profiling protocol and analyses of rank usage, router overhead, and robustness under domain and language shift.
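The routing idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the linear-plus-sigmoid router, the hard 0.5 threshold, the squared budget penalty, and every name here (`masked_lora_delta`, `budget_penalty`, `W_router`) are assumptions chosen for illustration, since the abstract does not specify these details.

```python
import math

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def masked_lora_delta(x, A, B, W_router, threshold=0.5):
    """Low-rank update B @ (mask * (A @ x)) with an input-dependent mask.

    A is r x d_in, B is d_out x r, W_router is r x d_in (hypothetical router).
    Only rank directions whose gate exceeds `threshold` contribute.
    """
    gates = [sigmoid(z) for z in matvec(W_router, x)]     # one gate per rank direction
    mask = [1.0 if g > threshold else 0.0 for g in gates]
    h = [m * a for m, a in zip(mask, matvec(A, x))]       # masked rank activations
    return matvec(B, h), mask

def budget_penalty(masks, target_rank):
    """Assumed penalty: squared gap between mean active rank and the target."""
    mean_active = sum(sum(m) for m in masks) / len(masks)
    return (mean_active - target_rank) ** 2

# Toy example: d_in = 2, r = 3, d_out = 2.
x = [1.0, -1.0]
A = [[1, 0], [0, 1], [1, 1]]
B = [[1, 0, 0], [0, 1, 0]]
W_router = [[2, 0], [0, 2], [0, 0]]
delta, mask = masked_lora_delta(x, A, B, W_router)
effective_rank = sum(mask)  # number of rank directions active for this input
```

In this toy setup only the first rank direction is active for `x`, so the effective rank is 1 of 3. A real implementation would skip the masked rows of `A` and columns of `B` entirely to realize the compute savings, rather than multiplying by zero.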
MaskLoRA: Low-Rank Subspace–Induced Token Masking for Efficient and Faithful Language Models
Rifat Rafiuddin
Findings of the Association for Computational Linguistics: EACL 2026
2025
A Detailed Factor Analysis for the Political Compass Test: Navigating Ideologies of Large Language Models
Sadia Kamal | Lalu Prasad Yadav Prakash | S M Rafiuddin | Mohammed Rakib | Atriya Sen | Sagnik Ray Choudhury
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
The Political Compass Test (PCT) and similar surveys are commonly used to assess political bias in auto-regressive LLMs. Our rigorous statistical experiments show that while changes to standard generation parameters have minimal effect on PCT scores, prompt phrasing and fine-tuning individually and together can significantly influence results. Interestingly, fine-tuning on politically rich vs. neutral datasets does not lead to different shifts in scores. We also generalize these findings to a similar popular test called 8 Values. Humans do not change their responses to questions when prompted differently (“answer this question” vs “state your opinion”), or after exposure to politically neutral text, such as mathematical formulae. But the fact that the models do so raises concerns about the validity of these tests for measuring model bias, and paves the way for deeper exploration into how political and social views are encoded in LLMs.
A Formal Analysis of Chain-of-Thought Prompting via Turing Reductions
S M Rafiuddin | Muntaha Nujat Khan
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
Chain-of-Thought (CoT) prompting has emerged as a powerful empirical technique for eliciting multi-step reasoning from large language models by decomposing complex tasks into sequential subprompts. However, the formal computational trade-offs between internal computation, query count, and space usage remain unexplored. We introduce the CoT-oracle Turing machine, a formal model in which each subprompt corresponds to an oracle query, and define three resource metrics: internal time T(n), query complexity Q(n), and prompt buffer space S_prompt(n). We prove that (T,Q)-bounded CoT machines exactly capture the class P^O[Q(n)] of polynomial-time Turing reductions with Q(n) queries, derive upper bounds for P and NP-complete problems under linear and prefix-query budgets, and establish an Ω(n) query lower bound for SAT under P ≠ NP. Illustrative examples on integer factorization and SAT reconstruction, together with synthetic and LLM-based simulations, confirm our theoretical T–Q–S trade-off predictions. This framework provides principled guidelines for prompt design, noisy-oracle robustness, and cost-aware reasoning.
Learning What to Remember: Adaptive Probabilistic Memory Retention for Memory-Efficient Language Models
S M Rafiuddin | Muntaha Nujat Khan
Findings of the Association for Computational Linguistics: EMNLP 2025
Transformer attention scales quadratically with sequence length, O(n^2), limiting long-context use. We propose Adaptive Retention, a probabilistic, layer-wise token selection mechanism that learns which representations to keep under a strict global budget M. Retention is modeled with Bernoulli gates trained via a Hard-Concrete/variational relaxation and enforced with a simple top-M rule at inference, making the method differentiable and drop-in for standard encoders. Across classification, extractive QA, and long-document summarization, keeping only 30–50% of tokens preserves ≥ 95% of full-model performance while cutting peak memory by ~35–45% and improving throughput by up to ~1.8×. This architecture-agnostic approach delivers practical long-context efficiency without modifying base attention or task heads.
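The inference-time top-M rule mentioned in the abstract above can be sketched as follows. This is an illustrative assumption, not the paper's code: `top_m_retain` is a hypothetical helper, the retention scores are taken as given (in the paper they would come from the learned gates), and the sketch covers a single layer rather than a global budget shared across layers.

```python
def top_m_retain(scores, m):
    """Return indices of the m highest-scoring tokens, in original order.

    `scores` are per-token retention scores (assumed to be gate probabilities).
    """
    m = max(0, min(m, len(scores)))  # clamp the budget to a valid range
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:m])        # restore positional order for the encoder

# Toy example: keep 3 of 5 tokens.
scores = [0.9, 0.1, 0.7, 0.3, 0.8]
kept = top_m_retain(scores, 3)
```

Here `kept` holds the positions of the three highest-scoring tokens. Returning indices in positional order keeps the retained sequence compatible with position-dependent layers downstream.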