Jay Shah


2025

Capturing Patients’ Lived Experiences with Chronic Pain through Motivational Interviewing and Information Extraction
Hadeel R A Elyazori | Rusul Abdulrazzaq | Hana Al Shawi | Isaac Amouzou | Patrick King | Syleah Manns | Mahdia Popal | Zarna Patel | Secili Destefano | Jay Shah | Naomi Gerber | Siddhartha Sikdar | Seiyon Lee | Samuel Acuna | Kevin Lybarger
Proceedings of the Second Workshop on Patient-Oriented Language Processing (CL4Health)

Chronic pain affects millions, yet traditional assessments often fail to capture patients’ lived experiences comprehensively. In this study, we used a Motivational Interviewing framework to conduct semi-structured interviews with eleven adults experiencing chronic pain and then applied Natural Language Processing (NLP) to their narratives. We developed an annotation schema that integrates the International Classification of Functioning, Disability, and Health (ICF) with Aspect-Based Sentiment Analysis (ABSA) to convert unstructured narratives into structured representations of key patient experience dimensions. Furthermore, we evaluated whether Large Language Models (LLMs) can automatically extract information using this schema. Our findings advance scalable, patient-centered approaches to chronic pain assessment, paving the way for more effective, data-driven management strategies.

2022

The Bull and the Bear: Summarizing Stock Market Discussions
Ayush Kumar | Dhyey Jani | Jay Shah | Devanshu Thakar | Varun Jain | Mayank Singh
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Stock market investors debate and heavily discuss stock ideas, investing strategies, news, and market movements on social media platforms. These discussions are lengthy and require extensive domain expertise to understand. In this paper, we curate such discussions and construct a first-of-its-kind abstractive summarization dataset. Our curated dataset consists of 7888 Reddit posts, with manually constructed summaries for 400 of them. We robustly evaluate the summaries and conduct experiments on state-of-the-art (SOTA) summarization tools to showcase their limitations. We plan to make the dataset publicly available. A sample of the dataset is available here: https://dhyeyjani.github.io/RSMC