Prateek Puri
2026
Development and Benchmarking of a Blended Human-AI Qualitative Research Assistant
Joseph Matveyenko | James Liu | John David Parsons | Ryan Brown | Alina I. Palimaru | Vipul Gupta | Prateek Puri
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Qualitative research emphasizes constructing meaning through iterative engagement with textual data. Traditionally, this human-driven process requires navigating coder fatigue and interpretive drift, thus posing challenges when scaling analysis to larger, more complex datasets. Computational approaches to augment qualitative research have been met with skepticism, partly due to their inability to replicate the nuance, context-awareness, and sophistication of human analysis. LLMs, however, present new opportunities to automate aspects of qualitative analysis while upholding rigor and research quality. In this work, we present and benchmark Muse, an interactive qualitative research system that allows researchers to identify themes and annotate datasets, achieving an inter-rater reliability between Muse and humans of Cohen’s 𝜅 = 0.7 for well-specified codes.
2024
Retrieval Augmented Generation of Subjective Explanations for Socioeconomic Scenarios
Razvan-Gabriel Dumitru | Maria Alexeeva | Keith Alcock | Nargiza Ludgate | Cheonkam Jeong | Zara Fatima Abdurahaman | Prateek Puri | Brian Kirchhoff | Santadarshan Sadhu | Mihai Surdeanu
Proceedings of the Sixth Workshop on Natural Language Processing and Computational Social Science (NLP+CSS 2024)
We introduce a novel retrieval augmented generation approach that explicitly models causality and subjectivity. We use it to generate explanations for socioeconomic scenarios that capture the beliefs of local populations. Through intrinsic and extrinsic evaluation, we show that our explanations, contextualized using causal and subjective information retrieved from local news sources, are rated higher than those produced by other large language models, both in how well they mimic the real population and in explanation quality. We also discuss the role subjectivity plays in the evaluation of this natural language generation task.