Grace Hui Yang

2026

Dual Hierarchical Dialogue Policy Learning for Legal Inquisitive Conversational Agents
Xubo Lin | Zezhi Deng | Shihao Wang | Grace Hui Yang | Yang Deng
Findings of the Association for Computational Linguistics: ACL 2026

Most existing dialogue systems are user-driven, primarily designed to fulfill user requests. However, in many critical real-world scenarios, a conversational agent must proactively extract information to achieve its own objectives rather than merely respond. To address this gap, we introduce Inquisitive Conversational Agents (ICAs) and develop an ICA specifically tailored to U.S. Supreme Court oral arguments. We propose a Dual Hierarchical Reinforcement Learning framework featuring two cooperating RL agents, each with its own policy, to coordinate strategic dialogue management and fine-grained utterance generation. By learning when and how to ask probing questions, the agent emulates judicial questioning patterns and systematically uncovers crucial information to fulfill its legal objectives. Evaluations on a U.S. Supreme Court dataset show our method outperforms single-agent RL baselines on multiple metrics. Although specialized to a single legal domain, it represents an important first step toward broader high-stakes, domain-specific applications. We attach part of the code as supplementary material. All code will be released upon publication for reproducibility.

YIELD: A Large-Scale Dataset and Evaluation Framework for Information Elicitation Agents
Victor De Lima | Grace Hui Yang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Most conversational agents (CAs) are designed to satisfy user needs through user-driven interactions. However, many real-world settings, such as academic interviewing, judicial proceedings, and journalistic investigations, involve broader institutional decision-making processes and require agents that can elicit information from users. In this paper, we introduce Information Elicitation Agents (IEAs) in which the agent’s goal is to elicit information from users to support the agent’s institutional or task-oriented objectives. To enable systematic research on this setting, we present YIELD, a 26M-token dataset of 2,281 ethically sourced, human-to-human dialogues. Moreover, we formalize information elicitation as a finite-horizon POMDP and propose novel metrics tailored to IEAs. Pilot experiments on multiple foundation LLMs show that training on YIELD improves their alignment with real elicitation behavior and findings are corroborated by human evaluation. We release YIELD under CC BY 4.0. The dataset, project code, evaluation tools, and fine-tuned model adapters are available at: https://github.com/infosenselab/yield.

2021

High-Quality Dialogue Diversification by Intermittent Short Extension Ensembles
Zhiwen Tang | Hrishikesh Kulkarni | Grace Hui Yang
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2020

More Diverse Dialogue Datasets via Diversity-Informed Data Collection
Katherine Stasaski | Grace Hui Yang | Marti A. Hearst
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Automated generation of conversational dialogue using modern neural architectures has made notable advances. However, these models are known to have a drawback of often producing uninteresting, predictable responses; this is known as the diversity problem. We introduce a new strategy to address this problem, called Diversity-Informed Data Collection. Unlike prior approaches, which modify model architectures to solve the problem, this method uses dynamically computed corpus-level statistics to determine which conversational participants to collect data from. Diversity-Informed Data Collection produces significantly more diverse data than baseline data collection methods, and better results on two downstream tasks: emotion classification and dialogue generation. This method is generalizable and can be used with other corpus-level metrics.
