Darshan Girish Deshpande


2025

pdf bib
Browsing Lost Unformed Recollections: A Benchmark for Tip-of-the-Tongue Search and Reasoning
Sky CH-Wang | Darshan Girish Deshpande | Smaranda Muresan | Anand Kannappan | Rebecca Qian
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We introduce Browsing Lost Unformed Recollections, a tip-of-the-tongue known-item search and reasoning benchmark for general AI assistants. BLUR introduces a set of 573 real-world validated questions that demand searching and reasoning across multimodal and multilingual inputs, as well as proficient tool use, in order to excel on. Humans easily ace these questions (scoring on average 98%), while the best-performing system scores around 56%. To facilitate progress toward addressing this challenging and aspirational use case for general AI assistants, we release 350 questions through a public leaderboard, retain the answers to 250 of them, and have the rest as a private test set.

2024

pdf bib
Robust Text Classification: Analyzing Prototype-Based Networks
Zhivar Sourati | Darshan Girish Deshpande | Filip Ilievski | Kiril Gashteovski | Sascha Saralajew
Findings of the Association for Computational Linguistics: EMNLP 2024

Downstream applications often require text classification models to be accurate and robust. While the accuracy of state-of-the-art Language Models (LMs) approximates human performance, they often exhibit a drop in performance on real-world noisy data. This lack of robustness can be concerning, as even small perturbations in text, irrelevant to the target task, can cause classifiers to incorrectly change their predictions. A potential solution can be the family of Prototype-Based Networks (PBNs) that classifies examples based on their similarity to prototypical examples of a class (prototypes) and has been shown to be robust to noise for computer vision tasks. In this paper, we study whether the robustness properties of PBNs transfer to text classification tasks under both targeted and static adversarial attack settings. Our results show that PBNs, as a mere architectural variation of vanilla LMs, offer more robustness compared to vanilla LMs under both targeted and static settings. We showcase how PBNs’ interpretability can help us understand PBNs’ robustness properties. Finally, our ablation studies reveal the sensitivity of PBNs’ robustness to the strictness of clustering and the number of prototypes in the training phase, as tighter clustering and a low number of prototypes result in less robust PBNs.