Mahnoosh Mehrabani


2025

pdf bib
Chain-of-Interactions: Multi-step Iterative ICL Framework for Abstractive Task-Oriented Dialogue Summarization of Conversational AI Interactions
Jason S Lucas | Ali Al Lawati | Mahjabin Nahar | John Chen | Mahnoosh Mehrabani
Findings of the Association for Computational Linguistics: EMNLP 2025

Large Language Models (LLMs) have introduced paradigm-shifting approaches in natural language processing. Yet, their transformative in-context learning (ICL) capabilities remain underutilized, especially in customer service dialogue summarization—a domain plagued by generative hallucinations, detail omission, and inconsistencies. We present Chain-of-Interactions (CoI), a novel single-instance, multi-step framework that orchestrates information extraction, self-correction, and evaluation through sequential interactive generation chains. By strategically leveraging LLMs’ ICL capabilities through precisely engineered prompts, CoI dramatically enhances abstractive task-oriented dialogue summarization (ATODS) quality and usefulness. Our comprehensive evaluation on real-world and benchmark human-agent interaction datasets demonstrates CoI’s effectiveness through rigorous testing across 11 models and 7 prompting approaches, with 9 standard automatic evaluation metrics, 3 LLM-based evaluations, and human studies involving 480 evaluators across 9 quality dimensions. Results reveal CoI’s decisive superiority, outperforming all single-step approaches and achieving 6× better entity preservation, 49% higher quality scores, and 322% improvement in accuracy compared to state-of-the-art multi-step Chain-of-Density (CoD). This research addresses critical gaps in task-oriented dialogue summarization for customer service applications and establishes new standards for harnessing LLMs’ reasoning capabilities in practical, industry-relevant contexts.

2023

pdf bib
Combining Pre trained Speech and Text Encoders for Continuous Spoken Language Processing
Karan Singla | Mahnoosh Mehrabani | Daniel Pressel | Ryan Price | Bhargav Srinivas Chinnari | Yeon-Jun Kim | Srinivas Bangalore
Proceedings of the 20th International Conference on Natural Language Processing (ICON)

2021

pdf bib
A Hybrid Approach to Scalable and Robust Spoken Language Understanding in Enterprise Virtual Agents
Ryan Price | Mahnoosh Mehrabani | Narendra Gupta | Yeon-Jun Kim | Shahab Jalalvand | Minhua Chen | Yanjie Zhao | Srinivas Bangalore
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Papers

Spoken language understanding (SLU) extracts the intended mean- ing from a user utterance and is a critical component of conversational virtual agents. In enterprise virtual agents (EVAs), language understanding is substantially challenging. First, the users are infrequent callers who are unfamiliar with the expectations of a pre-designed conversation flow. Second, the users are paying customers of an enterprise who demand a reliable, consistent and efficient user experience when resolving their issues. In this work, we describe a general and robust framework for intent and entity extraction utilizing a hybrid of statistical and rule-based approaches. Our framework includes confidence modeling that incorporates information from all components in the SLU pipeline, a critical addition for EVAs to en- sure accuracy. Our focus is on creating accurate and scalable SLU that can be deployed rapidly for a large class of EVA applications with little need for human intervention.

2018

pdf bib
Practical Application of Domain Dependent Confidence Measurement for Spoken Language Understanding Systems
Mahnoosh Mehrabani | David Thomson | Benjamin Stern
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 3 (Industry Papers)

Spoken Language Understanding (SLU), which extracts semantic information from speech, is not flawless, specially in practical applications. The reliability of the output of an SLU system can be evaluated using a semantic confidence measure. Confidence measures are a solution to improve the quality of spoken dialogue systems, by rejecting low-confidence SLU results. In this study we discuss real-world applications of confidence scoring in a customer service scenario. We build confidence models for three major types of dialogue states that are considered as different domains: how may I help you, number capture, and confirmation. Practical challenges to train domain-dependent confidence models, including data limitations, are discussed, and it is shown that feature engineering plays an important role to improve performance. We explore a wide variety of predictor features based on speech recognition, intent classification, and high-level domain knowledge, and find the combined feature set with the best rejection performance for each application.