Akshay Paruchuri


2025

“What’s Up, Doc?”: Analyzing How Users Seek Health Information in Large-Scale Conversational AI Datasets
Akshay Paruchuri | Maryam Aziz | Rohit Vartak | Ayman Ali | Best Uchehara | Xin Liu | Ishan Chatterjee | Monica Agrawal
Findings of the Association for Computational Linguistics: EMNLP 2025

People are increasingly seeking healthcare information from large language models (LLMs) via interactive chatbots, yet the nature and inherent risks of these conversations remain largely unexplored. In this paper, we filter large-scale conversational AI datasets to obtain HealthChat-11K, a curated dataset of 11K real-world conversations composed of 25K user messages. We use HealthChat-11K and a clinician-driven taxonomy for how users interact with LLMs when seeking healthcare information in order to systematically study user interactions across 21 distinct health specialties. Our analysis reveals insights into the nature of how and why users seek health information, such as common interactions, instances of incomplete context, affective behaviors, and interactions (e.g., leading questions) that can induce sycophancy, underscoring the need for improvements in the healthcare support capabilities of LLMs deployed as conversational AI. We release code and artifacts to retrieve our analyses and combine them into a curated dataset for further research.

2024

What Are the Odds? Language Models Are Capable of Probabilistic Reasoning
Akshay Paruchuri | Jake Garrison | Shun Liao | John B Hernandez | Jacob Sunshine | Tim Althoff | Xin Liu | Daniel McDuff
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Language models (LMs) are capable of remarkably complex linguistic tasks; however, numerical reasoning is an area in which they frequently struggle. An important but rarely evaluated form of reasoning is understanding probability distributions. In this paper, we focus on evaluating the probabilistic reasoning capabilities of LMs using idealized and real-world statistical distributions. We perform a systematic evaluation of state-of-the-art LMs on three tasks: estimating percentiles, drawing samples, and calculating probabilities. We evaluate three ways to provide context to LMs: 1) anchoring examples from within a distribution or family of distributions, 2) real-world context, and 3) summary statistics on which to base a Normal approximation. Models can make inferences about distributions, and can be further aided by the incorporation of real-world context, example shots and simplified assumptions, even if these assumptions are incorrect or misspecified. To conduct this work, we developed a comprehensive benchmark distribution dataset with associated question-answer pairs that we have released publicly.