2025
<SYNTACT>: Structuring Your Natural Language SOPs into Tailored Ambiguity-Resolved Code Templates
Sachin Kumar Giroh | Pushpendu Ghosh | Aryan Jain | Harshal Giridhari Paunikar | Aditi Rastogi | Promod Yenigalla | Anish Nediyanchath
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track
This paper introduces <SYNTACT>, a three-stage multi-agent LLM framework designed to transform unstructured and ambiguous Standard Operating Procedures (SOPs) into a structured plan and an executable code template. Unstructured SOPs, common across industries such as finance, retail, and logistics, frequently suffer from ambiguity, missing information, and inconsistency, all of which hinder automation. SYNTACT addresses this through: (1) a Clarifier module that disambiguates the SOP using large language models, an internal knowledge base (RAG), and human-in-the-loop input; (2) a Planner that converts the refined natural language instructions into a structured plan of hierarchical task flows through function (API) tagging, conditional branches, and human-in-the-loop checkpoints; and (3) an Implementor that generates executable code fragments or pseudocode templates. We evaluate SYNTACT on real-world SOPs and synthetic variants, demonstrating 88.4% end-to-end accuracy and a significant reduction in inconsistency compared to leading LLM baselines. Ablation studies highlight the necessity of each component, with performance dropping notably when modules are removed. Our findings show that structured multi-agent pipelines like SYNTACT can meaningfully improve consistency, reduce manual effort, and accelerate automation at scale.
2024
MARCO: Multi-Agent Real-time Chat Orchestration
Anubhav Shrimal | Stanley Kanagaraj | Kriti Biswas | Swarnalatha Raghuraman | Anish Nediyanchath | Yi Zhang | Promod Yenigalla
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track
Advances in large language models have enabled the development of multi-agent frameworks to tackle complex, real-world problems, such as automating workflows that require interactions with diverse tools, reasoning, and human collaboration. We present MARCO, a Multi-Agent Real-time Chat Orchestration framework for automating workflows using LLMs. MARCO addresses key challenges in utilizing LLMs for complex, multi-step task execution in a production environment. It incorporates robust guardrails to steer LLM behavior, validate outputs, and recover from errors that stem from inconsistent output formatting, function and parameter hallucination, and lack of domain knowledge. Through extensive experiments, we demonstrate MARCO’s superior performance, with 94.48% and 92.74% task execution accuracy on the Digital Restaurant Service Platform conversations and Retail conversations datasets, respectively, along with 44.91% improved latency and a 33.71% cost reduction in a production setting. We also report the effect of guardrails on performance, along with comparisons of various LLMs, both open-source and proprietary. The modular and generic design of MARCO allows it to be adapted for automating workflows across domains and to execute complex tasks through multi-turn interactions.
2020
Semantic Slot Prediction on low corpus data using finite user defined list
Bharatram Natarajan | Dharani Simma | Chirag Singh | Anish Nediyanchath | Sreoshi Sengupta
Proceedings of the 17th International Conference on Natural Language Processing (ICON)
Semantic slot prediction is one of the important tasks in natural language understanding (NLU). It depends on the quality and quantity of human-crafted training data, which affects model generalization. With the advent of voice assistants exposing AI platforms to third-party developers, training data quality and quantity matter for any machine learning algorithm to learn and generalize properly. AI platforms provide a provision for developers to add a custom external plist for the training data. Hence, we explore a dataset called LowCorpusSlotData, which contains low-corpus training data with a larger number of slots and significant test data. We also use an external plist for this dataset to aid slot identification. We experimented with state-of-the-art architectures such as Bi-directional Encoder Representations from Transformers (BERT) with variants and a Bi-directional Encoder with Custom Decoder. To address the low-corpus problem, we propose a pipeline approach in which we extract candidate slot information using the external plist extractor module and feed it as input along with the utterance.
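The pipeline idea in this abstract (extract candidate slots from a user-defined list, then pass them to the model together with the utterance) can be illustrated with a minimal sketch. The abstract does not specify the matching method or the input format, so everything here is an assumption: the function names `extract_candidates` and `build_model_input`, the case-insensitive whole-phrase matching, and the `[SEP]`-joined hint string are all hypothetical and are not the authors' implementation.

```python
import re
from typing import Dict, List, Tuple


def extract_candidates(utterance: str, plist: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Return (slot_name, matched_phrase) pairs found in the utterance.

    `plist` maps a slot name to its finite, user-defined list of phrases.
    Matching is case-insensitive and on whole words (an assumption).
    """
    candidates = []
    lowered = utterance.lower()
    for slot, phrases in plist.items():
        # Try longer phrases first so "new york city" wins over "new york".
        for phrase in sorted(phrases, key=len, reverse=True):
            pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
            if re.search(pattern, lowered):
                candidates.append((slot, phrase))
                break  # keep at most one candidate per slot
    return candidates


def build_model_input(utterance: str, plist: Dict[str, List[str]]) -> str:
    """Append candidate slot hints to the utterance (hypothetical input format)."""
    candidates = extract_candidates(utterance, plist)
    hints = " ".join(f"{slot}={phrase}" for slot, phrase in candidates)
    return f"{utterance} [SEP] {hints}" if hints else utterance


example_plist = {"city": ["new york", "new york city"], "cuisine": ["thai"]}
model_input = build_model_input("Book a Thai place in New York City", example_plist)
print(model_input)
```

In a setup like this, the string returned by `build_model_input` would be tokenized and passed to the slot tagger in place of the raw utterance, so the model sees the list matches as extra evidence even when training data is scarce.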
2019
Robust Deep Learning Based Sentiment Classification of Code-Mixed Text
Siddhartha Mukherjee | Vinuthkumar Prasan | Anish Nediyanchath | Manan Shah | Nikhil Kumar
Proceedings of the 16th International Conference on Natural Language Processing
India is one of the few countries in the world with a legacy of linguistic diversity. Most of its languages are influenced by English, which leads to a large presence of code-mixed text on social media. This abundance of code-mixed text makes it an important research area for Natural Language Processing (NLP). This paper proposes a novel attention-based deep learning technique for Sentiment Classification on Code-Mixed Text (ACCMT) of Hindi-English. The proposed architecture uses a fusion of character and word features. The lack of suitable word embeddings to represent code-mixed text is another important hurdle for this class of NLP tasks. This paper also proposes a novel technique for preparing word embeddings for code-mixed text. The embedding is prepared from two separately trained word embeddings, on Romanized Hindi and on English respectively, and is then used in the proposed deep learning architecture for robust classification. The proposed technique achieves 71.97% accuracy, which exceeds the baseline accuracy.