Chirag Shah
2026
PROPER Agents: Proactivity Driven Personalized Agents for Advancing Knowledge Gap Navigation
Kirandeep Kaur | Vinayak Gupta | Aditya Gupta | Chirag Shah
Findings of the Association for Computational Linguistics: ACL 2026
Current approaches to proactive assistance move beyond the ask-and-respond paradigm by anticipating user needs. In practice, they either burden users with clarifying questions or rely on context-based extrapolation, often leading to unnecessary or mistimed interventions. Such systems lack explicit mechanisms to model users’ knowledge gaps, resulting in incomplete or suboptimal task outcomes. To address this, we propose PROPER, a framework that explicitly models user-specific knowledge gaps in a controlled manner. Central to our approach is the notion of dimensions: structured, task-relevant factors that define the considerations required for effective task completion. Given a user query, the DGA (Dimension Generating Agent) identifies explicit dimensions (from the user’s query) and generates a set of candidate implicit dimensions capturing unarticulated aspects of the task. The RGA (Response Generating Agent) integrates both explicit and implicit dimensions selectively to produce personalized, context-aware, and proactively informative responses. We evaluate PROPER across multiple domains using a structured, gap-aware rubric that measures coverage, initiative appropriateness, and intent alignment. PROPER improves on quality scores and win rates across all domains, achieving up to 84% gains in single-turn evaluation and consistent dominance in multiturn interactions. All code for PROPER is available at: https://github.com/i-kiran/ProPer-Agent.
ClaimDB: A Fact Verification Benchmark over Large Structured Data
Michael Theologitis | Preetam Prabhu Srikar Dammu | Chirag Shah | Dan Suciu
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Real-world fact-checking often involves verifying claims grounded in structured data at scale. Despite substantial progress in fact-verification benchmarks, this setting remains largely underexplored. In this work, we introduce ClaimDB, a fact-verification benchmark where the evidence for claims is derived from compositions of millions of records and multiple tables. ClaimDB consists of 80 unique real-life databases covering a wide range of domains, from governance and healthcare to media, education and the natural sciences. At this scale, verification approaches that rely on "reading" the evidence break down, forcing a timely shift toward reasoning in executable programs. We conduct extensive experiments with 30 state-of-the-art proprietary and open-source (below 70B) LLMs and find that more than half score below 55% accuracy. Our analysis also reveals that both closed- and open-source models struggle with abstention – the ability to admit that there is no evidence to decide – raising doubts about their reliability in high-stakes data analysis tasks. We release the benchmark, code, and the LLM leaderboard at https://claimdb.github.io.
2024
S3-DST: Structured Open-Domain Dialogue Segmentation and State Tracking in the Era of LLMs
Sarkar Snigdha Sarathi Das | Chirag Shah | Mengting Wan | Jennifer Neville | Longqi Yang | Reid Andersen | Georg Buscher | Tara Safavi
Findings of the Association for Computational Linguistics: ACL 2024
Traditional Dialogue State Tracking (DST) has focused on tracking preferences and intents in conversations centered around specific tasks (e.g. booking services). These conventional systems assume a relatively restricted conversation flow in which each turn gradually offers new information. However, advancements in Large Language Models (LLMs) have ushered in more versatile open-domain chat systems in which extended dialogue sessions encompassing numerous tasks and topics are common, in turn requiring new conversational tracking tools in order to successfully orchestrate such systems. Addressing these challenges, we introduce a novel approach combining dialogue segmentation and state tracking within open-domain dialogues, tailored for zero-shot applications appropriate to a true open-domain dialogue system. Our proposed method S3-DST employs a unique structured prompting technique and *Pre-Analytical Recollection*, a novel grounding mechanism we designed for improving long context tracking. Tested on proprietary anonymized open-domain dialogue datasets as well as publicly available DST and segmentation datasets, S3-DST consistently outperforms the state-of-the-art, showcasing its effectiveness and adaptability for state tracking in the next wave of LLM-based chat systems. We also release S3-DST annotations with GPT-4 on a curated subset of LMSYS-Chat-1M to be used as a testbed to fuel research in this direction.
ClaimVer: Explainable Claim-Level Verification and Evidence Attribution of Text Through Knowledge Graphs
Preetam Prabhu Srikar Dammu | Himanshu Naidu | Mouly Dewan | YoungMin Kim | Tanya Roosta | Aman Chadha | Chirag Shah
Findings of the Association for Computational Linguistics: EMNLP 2024
In the midst of widespread misinformation and disinformation through social media and the proliferation of AI-generated texts, it has become increasingly difficult for people to validate and trust information they encounter. Many fact-checking approaches and tools have been developed, but they often lack appropriate explainability or granularity to be useful in various contexts. A text validation method that is easy to use, accessible, and can perform fine-grained evidence attribution has become crucial. More importantly, building user trust in such a method requires presenting the rationale behind each prediction, as research shows this significantly influences people’s belief in automated systems. Localizing and bringing users’ attention to the specific problematic content is also paramount, instead of providing simple blanket labels. In this paper, we present ClaimVer, a human-centric framework tailored to meet users’ informational and verification needs by generating rich annotations and thereby reducing cognitive load. Designed to deliver comprehensive evaluations of texts, it highlights each claim, verifies it against a trusted knowledge graph (KG), presents the evidence, and provides succinct, clear explanations for each claim prediction. Finally, our framework introduces an attribution score, enhancing applicability across a wide range of downstream tasks.
2009
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Student Research Workshop and Doctoral Consortium
Ulrich Germann | Chirag Shah | Svetlana Stoyanchev | Carolyn Penstein Rosé | Anoop Sarkar
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Student Research Workshop and Doctoral Consortium