Lidiia Ostyakova
2026
DiscoRAG: A Discourse-Aware Agent for Query-Based Summarization of Long Documents
Alexander Chernyavskiy | Lidiia Ostyakova | Dmitry Ilvovsky
Proceedings of the Fifteenth Language Resources and Evaluation Conference
Query-based summarization of long documents is often tackled with retrieval-augmented generation (RAG). However, conventional RAG models exhibit limitations when applied to narrative texts, where crucial evidence is often implicit and distributed. This exposes a distinct class of “discourse-aware” queries that require specialized, structure-aware models. To address this, we introduce DiscoRAG, a framework that leverages Rhetorical Structure Theory (RST). By modeling the document as a discourse tree, DiscoRAG navigates its structure, explicitly using rhetorical relations to focus on and aggregate evidence from globally related segments. Furthermore, our pipeline integrates a classifier that assesses query complexity to dynamically select the most efficient retrieval strategy. We evaluate DiscoRAG against standard and extended-context RAG pipelines on the SQuALITY dataset, which we release augmented with questions requiring deep discourse reasoning and integration of the global narrative. Our results show that DiscoRAG substantially outperforms these baselines, demonstrating its superior ability to assemble a coherent, contextually rich evidence base by interpreting the global narrative structure rather than relying on local semantic similarity.
2024
GroundHog: Dialogue Generation using Multi-Grained Linguistic Input
Alexander Chernyavskiy | Lidiia Ostyakova | Dmitry Ilvovsky
Proceedings of the 5th Workshop on Computational Approaches to Discourse (CODI 2024)
Recent language models have significantly boosted conversational AI by enabling fast and cost-effective response generation in dialogue systems. However, dialogue systems based on neural generative approaches often lack truthfulness, reliability, and the ability to analyze the dialogue flow needed for smooth and consistent conversations with users. To address these issues, we introduce GroundHog, a modified BART architecture that captures long multi-grained inputs gathered from various factual and linguistic sources, such as Abstract Meaning Representation, discourse relations, sentiment, and grounding information. For experiments, we present an automatically collected dataset from Reddit that includes multi-party conversations devoted to movies and TV series. The evaluation combines automatic metrics with human evaluation. The results show that using several linguistic inputs has the potential to enhance dialogue consistency, meaningfulness, and overall generation quality, even for automatically annotated data. We also provide an analysis that highlights the importance of individual linguistic features in interpreting the observed enhancements.
2023
ChatGPT vs. Crowdsourcing vs. Experts: Annotating Open-Domain Conversations with Speech Functions
Lidiia Ostyakova | Veronika Smilga | Kseniia Petukhova | Maria Molchanova | Daniel Kornev
Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue
This paper addresses the task of annotating open-domain conversations with speech functions. We propose a semi-automated method in which Large Language Models annotate dialogs according to a topic-oriented, multi-layered taxonomy of speech functions, guided by hierarchical guidelines. These guidelines comprise simple questions about the topic and speaker change, sentence types, and pragmatic aspects of the utterance, together with examples that aid untrained annotators in understanding the taxonomy. We compare the results of dialog annotation performed by experts, crowdsourcing workers, and ChatGPT. To improve the performance of ChatGPT, we conducted several experiments using different prompt engineering techniques. We demonstrate that in some cases large language models following a multi-step, tree-like annotation pipeline can achieve human-like performance on complex discourse annotation, which is usually challenging and costly in terms of time and money when performed by humans.