Rattandeep Singh
2025
Program Synthesis Dialog Agents for Interactive Decision-Making
Matthew Toles
|
Nikhil Balwani
|
Rattandeep Singh
|
Valentina Giulia Sartori Rodriguez
|
Zhou Yu
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
Many real-world eligibility problems, ranging from medical diagnosis to tax planning, can be mapped to decision problems expressed in natural language, wherein a model must make a binary choice based on the features of the user. Large-scale domains such as legal codes or frequently updated funding opportunities render human annotation (e.g., web forms or decision trees) impractical, suggesting a need for agents that can automatically assist in decision-making. Since relevant information is often only known to the user, it is important that these agents can ask the right questions. To evaluate this task, we propose BeNYfits, a new benchmark for determining user eligibility for multiple overlapping social benefits opportunities through interactive decision-making. Our experiments show that current language models struggle with frequent hallucinations, with GPT-4o scoring only 35.7 F1 using a ReAct-style chain-of-thought. We therefore introduce ProADA, a novel approach that uses program synthesis to assist in decision-making by mapping dialog planning to a code generation problem and using gaps in structured data to determine the best next action. Our agent, ProADA, improves the F1 score to 56.2 while using nearly the same number of dialog turns.