Arnav Bhavsar


2025

INDRA: Iterative Difficulty Refinement Attention for MCQ Difficulty Estimation for Indic Languages
Manikandan Ravikiran | Rohit Saluja | Arnav Bhavsar
Proceedings of the 1st Workshop on Benchmarks, Harmonization, Annotation, and Standardization for Human-Centric AI in Indian Languages (BHASHA 2025)

Estimating the difficulty of multiple-choice questions (MCQs) is central to adaptive testing and learner modeling. We introduce INDRA (Iterative Difficulty Refinement Attention), a novel attention mechanism that unifies psychometric priors with neural refinement for Indic MCQ difficulty estimation. INDRA incorporates three key innovations: (i) IRT-informed initialization, which assigns token-level discrimination and difficulty scores to embed psychometric interpretability; (ii) entropy-driven iterative refinement, which progressively sharpens attention to mimic the human process of distractor elimination; and (iii) Indic-Aware Graph Coupling, which propagates plausibility across morphologically and semantically related tokens, a critical feature for Indic languages. Experiments on the TEEMIL-H and TEEMIL-K datasets show that INDRA achieves consistent improvements, with absolute gains of up to +1.02 and +1.68 F1 over the state of the art, and ablation studies show that psychometric priors, entropy refinement, and graph coupling contribute complementary gains in accuracy and robustness.
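
To make the second innovation concrete, here is a minimal sketch of entropy-driven iterative attention refinement, assuming a simple temperature-scaling update rule; the abstract does not specify INDRA's actual update, so the function name and the (1 + entropy) sharpening factor below are illustrative, not the paper's method.

import torch
import torch.nn.functional as F

def entropy_refined_attention(scores, n_iters=3, tau=1.0):
    # Start from a plain softmax over raw attention logits.
    attn = F.softmax(scores / tau, dim=-1)
    for _ in range(n_iters):
        # Shannon entropy of each row's current attention distribution.
        entropy = -(attn * (attn + 1e-9).log()).sum(dim=-1, keepdim=True)
        # Hypothetical update: rows with higher entropy (more uncertainty)
        # get their logits scaled up, i.e. a lower effective temperature,
        # so the distribution sharpens over iterations, loosely mimicking
        # distractor elimination.
        attn = F.softmax(scores * (1.0 + entropy) / tau, dim=-1)
    return attn

scores = torch.randn(2, 5)  # (batch, tokens) raw attention logits
print(entropy_refined_attention(scores))

Each pass concentrates mass on the currently dominant tokens, so low-entropy (already confident) rows change little while high-entropy rows are pushed toward a decision.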

TEEMIL: Towards Educational MCQ Difficulty Estimation in Indic Languages
Manikandan Ravikiran | Siddharth Vohra | Rajat Verma | Rohit Saluja | Arnav Bhavsar
Proceedings of the 31st International Conference on Computational Linguistics

Difficulty estimation of multiple-choice questions (MCQs) is crucial for creating effective educational assessments, yet it remains underexplored in Indic languages like Hindi and Kannada due to the lack of comprehensive datasets. This paper addresses this gap by introducing two datasets, TEEMIL-H and TEEMIL-K, containing 4689 and 4215 MCQs, respectively, with manually annotated difficulty labels. We benchmark these datasets using state-of-the-art multilingual models and conduct ablation studies to analyze the effects of context, of the answer options, and of a None of the Above (NOTA) option on difficulty estimation. Our findings establish baselines for difficulty estimation in Hindi and Kannada, offering valuable insights into improving model performance and guiding future research in MCQ difficulty estimation.
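
A minimal sketch of the benchmarking setup the ablations imply is given below: the input is assembled with or without context, options, and a NOTA choice, then fed to a multilingual encoder for difficulty classification. The model choice (mBERT), the [SEP] joining convention, and the three-way label set are assumptions for illustration, not details from the paper.

from transformers import AutoTokenizer, AutoModelForSequenceClassification

def build_input(question, options=None, context=None, add_nota=False):
    # Toggle each component to reproduce the context/options/NOTA ablations.
    parts = []
    if context:
        parts.append(context)
    parts.append(question)
    if options:
        opts = list(options) + (["None of the Above"] if add_nota else [])
        parts.append(" ".join(f"({i + 1}) {o}" for i, o in enumerate(opts)))
    return " [SEP] ".join(parts)

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=3)  # e.g. easy/medium/hard (assumed)

text = build_input("Which gas do plants absorb?", options=["O2", "CO2", "N2"], add_nota=True)
logits = model(**tokenizer(text, return_tensors="pt")).logits  # head is untuned; fine-tune on difficulty labels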

Towards Blind and Low-Vision Accessibility of Lightweight VLMs and Custom LLM-Evals
Shruti Singh Baghel | Yash Pratap Singh Rathore | Anurag Pradhan | Sushovan Jena | Arnav Bhavsar | Amit Shukla | Pawan Goyal
Proceedings of the 1st Workshop on Multimodal Models for Low-Resource Contexts and Social Impact (MMLoSo 2025)

Large Vision-Language Models (VLMs) excel at understanding and generating video descriptions, but their high memory, computation, and deployment demands hinder practical use, particularly for blind and low-vision (BLV) users who depend on detailed, context-aware descriptions. To study the effect of model size on accessibility-focused description quality, we evaluate SmolVLM2 variants with 500M and 2.2B parameters across two diverse datasets: AVCaps (outdoor) and Charades (indoor). In this work, we introduce two novel evaluation frameworks specifically designed for BLV accessibility assessment: the Multi-Context BLV Framework, which evaluates spatial orientation, social interaction, action events, and ambience contexts; and the Navigational Assistance Framework, which focuses on mobility-critical information. Additionally, we conduct a systematic evaluation of four different prompt design strategies and deploy both models on a smartphone, evaluating FP32 and INT8 precision variants to assess real-world performance under the constraints of resource-limited mobile devices.
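
The FP32 vs. INT8 comparison can be sketched with PyTorch's dynamic quantization, shown below on a small placeholder network rather than SmolVLM2 itself; loading the real model and the on-device runtime are out of scope here, so this is only an assumed illustration of the precision trade-off.

import torch

# Placeholder FP32 model standing in for the deployed VLM.
model_fp32 = torch.nn.Sequential(
    torch.nn.Linear(512, 512), torch.nn.ReLU(), torch.nn.Linear(512, 64))

# Dynamic INT8 quantization of the Linear layers: weights are stored in
# int8 and dequantized on the fly, cutting memory roughly 4x for those layers.
model_int8 = torch.ao.quantization.quantize_dynamic(
    model_fp32, {torch.nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 512)
print(model_fp32(x).shape, model_int8(x).shape)  # same interface, smaller weights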