2025
SQLGenie: A Practical LLM based System for Reliable and Efficient SQL Generation
Pushpendu Ghosh | Aryan Jain | Promod Yenigalla
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track)
Large Language Models (LLMs) enable natural language to SQL conversion, allowing users to query databases without SQL expertise. However, generating accurate, efficient queries is challenging due to ambiguous intent, domain knowledge requirements, and database constraints. Extensive reasoning improves SQL quality but increases computational costs and latency. We propose SQLGenie, a practical system for reliable SQL generation. It consists of three components: (1) Table Onboarder, which analyzes new tables, optimizes indexing, partitions data, identifies foreign key relationships, and stores schema details for SQL generation; (2) SQL Generator, an LLM-based system producing accurate SQL; and (3) Feedback Augmentation, which filters correct query-SQL pairs, leverages multiple LLM agents for complex SQL, and stores verified examples. SQLGenie achieves state-of-the-art performance on public benchmarks (92.8% execution accuracy on WikiSQL, 82.1% on Spider, 73.8% on BIRD) and internal datasets, surpassing the best single-LLM baseline by 21.5% and the strongest pipeline competitor by 5.3%. Its hybrid variant optimally balances accuracy and efficiency, reducing generation time by 64% compared to traditional multi-LLM approaches while maintaining competitive accuracy.
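The abstract describes a three-component loop: onboard schema metadata, generate SQL with an LLM, and feed verified question-SQL pairs back as examples. A minimal sketch of how such a pipeline could be wired together is below; all class and function names (TableOnboarder, SQLGenerator, FeedbackStore, call_llm) are hypothetical stand-ins, not SQLGenie's actual API.

```python
# Hypothetical sketch of the three-component flow from the abstract.
from dataclasses import dataclass, field


def call_llm(prompt: str) -> str:
    """Placeholder for any LLM client call."""
    return "SELECT SUM(amount) FROM sales;"  # stubbed response


@dataclass
class TableMetadata:
    schema: str
    indexes: list = field(default_factory=list)
    foreign_keys: list = field(default_factory=list)


class TableOnboarder:
    """Analyzes a new table and stores schema details for generation."""
    def onboard(self, ddl: str) -> TableMetadata:
        # The paper's component also optimizes indexing and partitioning;
        # this sketch only records the schema text.
        return TableMetadata(schema=ddl)


class SQLGenerator:
    """LLM-based generator grounded in onboarded schema metadata."""
    def generate(self, question: str, meta: TableMetadata,
                 examples: list) -> str:
        prompt = (f"Schema:\n{meta.schema}\n"
                  f"Verified examples:\n{examples}\n"
                  f"Question: {question}\nSQL:")
        return call_llm(prompt)


class FeedbackStore:
    """Keeps only verified question-SQL pairs for few-shot reuse."""
    def __init__(self):
        self.verified: list = []

    def add_if_correct(self, question: str, sql: str, executed_ok: bool):
        if executed_ok:  # filter step: store correct pairs only
            self.verified.append((question, sql))


# Wiring the loop together:
onboarder, store = TableOnboarder(), FeedbackStore()
meta = onboarder.onboard("CREATE TABLE sales (id INT, amount REAL);")
sql = SQLGenerator().generate("Total sales amount?", meta, store.verified)
store.add_if_correct("Total sales amount?", sql, executed_ok=True)
```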
SYNTACT: Structuring Your Natural Language SOPs into Tailored Ambiguity-Resolved Code Templates
Sachin Kumar Giroh | Pushpendu Ghosh | Aryan Jain | Harshal Giridhari Paunikar | Aditi Rastogi | Promod Yenigalla | Anish Nediyanchath
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track
This paper introduces SYNTACT, a three-stage multi-agent LLM framework designed to transform unstructured and ambiguous Standard Operating Procedures (SOPs) into a structured plan and an executable code template. Unstructured SOPs, common across industries such as finance, retail, and logistics, frequently suffer from ambiguity, missing information, and inconsistency, all of which hinder automation. SYNTACT addresses this through: (1) a Clarifier module that disambiguates the SOP using large language models, an internal knowledge base (RAG), and human-in-the-loop input; (2) a Planner that converts the refined natural language instructions into a structured plan of hierarchical task flows through function (API) tagging, conditional branches, and human-in-the-loop checkpoints; and (3) an Implementor that generates executable code fragments or pseudocode templates. We evaluate SYNTACT on real-world SOPs and synthetic variants, demonstrating 88.4% end-to-end accuracy and a significant reduction in inconsistency compared to leading LLM baselines. Ablation studies highlight the necessity of each component, with performance dropping notably when modules are removed. Our findings show that structured multi-agent pipelines like SYNTACT can meaningfully improve consistency, reduce manual effort, and accelerate automation at scale.
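A minimal sketch of the Clarifier → Planner → Implementor hand-off described above is given here, assuming stubbed call_llm and retrieve_kb helpers; none of these names are SYNTACT's real interfaces, and the plan structure shown is an illustrative assumption.

```python
# Hypothetical three-stage pipeline sketch; all helpers are stubs.
def call_llm(prompt: str) -> str:
    return "1. Validate invoice; 2. If amount > 10k, route for approval"


def retrieve_kb(query: str) -> str:
    """Stand-in for a RAG retriever over an internal knowledge base."""
    return "Internal policy: invoices over $10k need manager approval."


def clarifier(sop_text: str) -> str:
    """Stage 1: resolve ambiguities with retrieved context; a real system
    would also escalate open questions to a human in the loop."""
    context = retrieve_kb(sop_text)
    return call_llm(f"Clarify this SOP using the context.\n"
                    f"Context: {context}\nSOP: {sop_text}")


def planner(clarified_sop: str) -> list:
    """Stage 2: convert clarified text into a structured plan with API
    tags, conditional branches, and human checkpoints."""
    plan_text = call_llm(f"Plan with API tags and branches:\n{clarified_sop}")
    return [{"step": plan_text,
             "api": "validate_invoice",     # assumed function tag
             "branch": "amount > 10000",    # assumed conditional
             "human_checkpoint": True}]


def implementor(plan: list) -> str:
    """Stage 3: emit an executable code or pseudocode template."""
    return call_llm(f"Generate a code template for this plan: {plan}")


template = implementor(planner(clarifier("Process vendor invoices...")))
```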
PARSE: LLM Driven Schema Optimization for Reliable Entity Extraction
Anubhav Shrimal | Aryan Jain | Soumyajit Chowdhury | Promod Yenigalla
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track
Structured information extraction from unstructured text is critical for emerging Software 3.0 systems where LLM agents autonomously interact with APIs and tools. Recent approaches apply large language models directly to extraction tasks using existing JSON schemas, often with constrained decoding or reinforcement learning to ensure syntactic validity, but treat JSON schemas as static contracts designed for human developers, leading to suboptimal extraction performance, frequent hallucinations, and unreliable agent behavior when schemas contain ambiguous or incomplete specifications. We recognize that JSON schemas are themselves a form of natural language understanding contract, encoding rules, relationships, and expectations about data structure that LLMs should be able to both interpret and systematically improve. Consequently, we develop PARSE (Parameter Automated Refinement and Schema Extraction), a novel system with two synergistic components: ARCHITECT, which autonomously optimizes JSON schemas for LLM consumption while maintaining backward compatibility through RELAY (an integrated code generation system), and SCOPE, which implements reflection-based extraction with combined static and LLM-based guardrails. We evaluate PARSE qualitatively and quantitatively on three datasets, Schema-Guided Dialogue (SGD), Structured Web Data Extraction (SWDE), and internal retail conversation data, and find that it achieves up to 64.7% improvement in extraction accuracy on SWDE, with combined framework improvements reaching 10% across models, while reducing extraction errors by 92% within the first retry and maintaining practical latency.
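The two-component design lends itself to a short sketch: an ARCHITECT-style pass that rewrites schema field descriptions for LLM consumption, and a SCOPE-style reflection loop that retries on guardrail failure. Everything below is an assumption for illustration (architect, static_guardrail, scope_extract, call_llm are not PARSE's real API).

```python
# Hypothetical sketch of schema optimization plus guarded extraction.
import json


def call_llm(prompt: str) -> str:
    return json.dumps({"price": "19.99"})  # stubbed extraction output


def architect(schema: dict) -> dict:
    """Rewrite ambiguous field specs so the LLM can follow them; a
    RELAY-style shim would keep original keys for backward compatibility."""
    optimized = dict(schema)
    for name, spec in optimized.get("properties", {}).items():
        spec.setdefault("description", f"Extract the {name} verbatim.")
    return optimized


def static_guardrail(output: str, schema: dict) -> bool:
    """Cheap static check: output must parse as JSON with known keys only."""
    try:
        data = json.loads(output)
    except json.JSONDecodeError:
        return False
    return set(data) <= set(schema.get("properties", {}))


def scope_extract(text: str, schema: dict, max_retries: int = 2) -> dict:
    """Reflection loop: re-prompt with the failure appended; the paper
    reports most errors resolved within the first retry."""
    prompt = f"Schema: {json.dumps(schema)}\nText: {text}\nJSON:"
    for _ in range(max_retries + 1):
        out = call_llm(prompt)
        if static_guardrail(out, schema):
            return json.loads(out)
        prompt += f"\nPrevious output was invalid: {out}\nRetry:"
    raise ValueError("extraction failed after retries")


schema = architect({"properties": {"price": {"type": "string"}}})
print(scope_extract("The widget costs $19.99.", schema))
```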
2023
Too much of product information : Don’t worry, let’s look for evidence!
Aryan Jain | Jitenkumar Rana | Chetan Aggarwal
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track
Product question answering (PQA) aims to provide an instant response to customer questions posted on shopping message boards, social media, brand websites, and retail stores. In this paper, we propose a distantly supervised solution to answer customer questions by using product information. Auto-answering questions using product information poses two main challenges: (i) labelled data is not readily available; (ii) lengthy product information requires attending to various parts of the text to answer the question. To this end, we first propose a novel distant supervision based NLI model to prepare training data without any manual effort. To deal with lengthy context, we factorize answer generation into two sub-problems: first, given product information, the model extracts evidence spans relevant to the question; then, the model leverages the evidence spans to generate the answer. Further, we propose two novelties in the fine-tuning approach: (i) we jointly fine-tune the model for both tasks in an end-to-end manner and show that this outperforms standard multi-task fine-tuning; (ii) we introduce an auxiliary contrastive loss for evidence extraction. We show that the combination of these two ideas achieves an absolute improvement of 6% in accuracy (human evaluation) over baselines.
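The joint objective described above (answer generation plus evidence extraction plus an auxiliary contrastive term) could take the form sketched below. This is illustrative only: the tensor shapes, the InfoNCE-style formulation of the contrastive term, and the weight lam are assumptions, not the paper's exact formulation.

```python
# Hypothetical joint loss sketch for the two-task fine-tuning setup.
import torch
import torch.nn.functional as F


def contrastive_evidence_loss(emb, pos_idx, temperature=0.1):
    """Pull the question embedding (row 0) toward the positive evidence
    span and away from other candidate spans (InfoNCE-style; assumed)."""
    question, spans = emb[0], emb[1:]
    sims = F.cosine_similarity(question.unsqueeze(0), spans) / temperature
    return F.cross_entropy(sims.unsqueeze(0), torch.tensor([pos_idx]))


# Toy tensors standing in for model outputs:
gen_logits = torch.randn(4, 100, requires_grad=True)   # answer tokens x vocab
gen_targets = torch.randint(0, 100, (4,))
span_logits = torch.randn(5, 2, requires_grad=True)    # spans x {no, evidence}
span_labels = torch.tensor([0, 1, 0, 0, 0])
embeddings = torch.randn(6, 32, requires_grad=True)    # [question; 5 spans]

lam = 0.5  # assumed weight for the auxiliary term
loss = (F.cross_entropy(gen_logits, gen_targets)       # answer generation
        + F.cross_entropy(span_logits, span_labels)    # evidence extraction
        + lam * contrastive_evidence_loss(embeddings, pos_idx=1))
loss.backward()
```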
Product question answering (PQA) aims to provide an instant response to customer questions posted on shopping message boards, social media, brand websites and retail stores. In this paper, we propose a distantly supervised solution to answer customer questions by using product information. Auto-answering questions using product information poses two main challenges:(i) labelled data is not readily available (ii)lengthy product information requires attending to various parts of the text to answer the question. To this end, we first propose a novel distant supervision based NLI model to prepare training data without any manual efforts. To deal with lengthy context, we factorize answer generation into two sub-problems. First, given product information, model extracts evidence spans relevant to question. Then, model leverages evidence spans to generate answer. Further, we propose two novelties in fine-tuning approach: (i) First, we jointly fine-tune model for both the tasks in end-to-end manner and showcase that it outperforms standard multi-task fine-tuning. (ii) Next, we introduce an auxiliary contrastive loss for evidence extraction. We show that combination of these two ideas achieves an absolute improvement of 6% in accuracy (human evaluation) over baselines.