Amit Shukla


2025

Towards Blind and Low-Vision Accessibility of Lightweight VLMs and Custom LLM-Evals
Shruti Singh Baghel | Yash Pratap Singh Rathore | Anurag Pradhan | Sushovan Jena | Arnav Bhavsar | Amit Shukla | Pawan Goyal
Proceedings of the 1st Workshop on Multimodal Models for Low-Resource Contexts and Social Impact (MMLoSo 2025)

Large Vision-Language Models (VLMs) excel at understanding and generating video descriptions, but their high memory, computation, and deployment demands hinder practical use, particularly for blind and low-vision (BLV) users who depend on detailed, context-aware descriptions. To study the effect of model size on accessibility-focused description quality, we evaluate SmolVLM2 variants with 500M and 2.2B parameters across two diverse datasets: AVCaps (outdoor) and Charades (indoor). In this work, we introduce two novel evaluation frameworks designed specifically for BLV accessibility assessment: the Multi-Context BLV Framework, which evaluates spatial orientation, social interaction, action events, and ambience contexts; and the Navigational Assistance Framework, which focuses on mobility-critical information. Additionally, we systematically evaluate four prompt design strategies and deploy both models on a smartphone in FP32 and INT8 precision variants to assess real-world performance under the constraints of resource-limited mobile devices.
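The pipeline the abstract describes (load a SmolVLM2 variant, prompt it for a BLV-oriented video description, and compare FP32 against INT8) can be sketched as below. The Hugging Face checkpoint name, the sample video path, and the prompt wording are illustrative assumptions, not the paper's exact setup; dynamic quantization here is only a desktop proxy for the on-device INT8 variant.

```python
# Minimal sketch, assuming the public SmolVLM2 checkpoints on Hugging Face.
import torch
from transformers import AutoProcessor, AutoModelForImageTextToText

MODEL_ID = "HuggingFaceTB/SmolVLM2-2.2B-Instruct"  # or the 500M variant

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForImageTextToText.from_pretrained(MODEL_ID, torch_dtype=torch.float32)

# A generic BLV-oriented prompt for illustration; the paper compares four
# prompt design strategies that the abstract does not spell out.
messages = [{
    "role": "user",
    "content": [
        {"type": "video", "path": "sample_clip.mp4"},  # e.g. an AVCaps or Charades clip
        {"type": "text", "text": "Describe this scene for a blind or low-vision "
                                 "user, including spatial layout and any obstacles."},
    ],
}]

inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
)
generated = model.generate(**inputs, max_new_tokens=256)
print(processor.batch_decode(generated, skip_special_tokens=True)[0])

# FP32 vs. INT8: dynamically quantizing the linear layers approximates the
# INT8 precision variant evaluated on the smartphone.
model_int8 = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
```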

2024

Survey on Computational Approaches to Implicature
Kaveri Anuranjana | Srihitha Mallepally | Sriharshitha Mareddy | Amit Shukla | Radhika Mamidi
Proceedings of the 21st International Conference on Natural Language Processing (ICON)

This paper surveys computational approaches to implicature in Natural Language Processing (NLP), highlighting its significance for understanding indirect communication. Drawing on foundational theories by Austin, Searle, and Grice, we discuss how implicature extends beyond literal language to convey nuanced meanings. We review existing datasets that assess models’ capabilities in recognizing and interpreting implicatures, including the Pragmatic Understanding Benchmark (PUB). Despite recent advances in large language models (LLMs), challenges remain in processing implicature effectively, owing to limitations in training data and the complexities of contextual interpretation. We propose future directions for research, including enhanced datasets and the integration of pragmatic reasoning tasks, to improve LLMs’ understanding of implicature and facilitate better human-computer interaction.
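An implicature-recognition probe of the kind the surveyed benchmarks pose can be illustrated with a minimal sketch. The dialogue, question wording, and stand-in model below are hypothetical and not drawn from the PUB benchmark itself.

```python
# Minimal sketch of an implicature probe: the model must infer that B's
# indirect reply implicates "no". All data here is illustrative.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # stand-in for a large LLM

dialogue = (
    "A: Are you coming to the party tonight?\n"
    "B: I have an early flight tomorrow."
)
prompt = (
    f"{dialogue}\n"
    "Question: Is B coming to the party? Answer yes or no.\n"
    "Answer:"
)

completion = generator(prompt, max_new_tokens=3, do_sample=False)[0]["generated_text"]
answer = completion[len(prompt):].strip().lower()
predicted = "yes" if answer.startswith("yes") else "no"
print(predicted)  # the implicated (gold) answer is "no"
```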