Yuru Jiang

Also published as: 玉茹


2024

pdf bib
基于逻辑推理和多任务融合的认知刺激对话生成方法(Cognitive stimulation dialogue generation method based on logical reasoning and multi-task integration)
Yuru Jiang (蒋玉茹) | Mengyuan Li (李梦媛) | Yuyang Tao (陶宇阳) | Keming Qu (区可明) | Zepeng She (佘泽鹏) | Shuicai Shi (施水才)
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)

“在全球老龄化背景下,带有认知刺激的对话系统是保持老年人认知健康的重要手段。中文认知刺激对话数据集(Chinese Cognitive Stimulation Conversation Dataset,CSConv)和模型构建的研究工作刚刚开始。本文将认知刺激对话生成视为一个多任务融合的逻辑思维推理过程,将情感分类任务、决策任务和对话回复生成任务间的逻辑关系,建模为一个推理过程,来引导大语言模型生成。针对决策任务,本文提出分层编码器结构的决策模型。决策实验结果表明,决策模型有效的提高了决策任务的准确率。针对多任务过程,本文提出多任务融合方法,将三个任务对应的模型结合在一起。生成实验结果表明,分类、决策及生成的多任务融合方法,显著提升了对话回复能力,证明了该方法的有效性和先进性。”

pdf bib
面向中文多方对话的机器阅读理解研究(Research on Machine Reading Comprehension for Chinese Multi-party Dialogues)
Yuru Jiang (蒋玉茹) | Yu Li (李宇) | Tingting Na (那婷婷) | Yangsen Zhang (张仰森)
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)

“在机器阅读理解领域,处理和分析多方对话一直是一项具有挑战性的研究任务。鉴于中文语境下相关数据资源的缺乏,本研究构建了DialogueMRC数据集,旨在促进该领域的研究进展。DialogueMRC数据集作为首个面向中文多方对话的机器阅读理解数据集,包含705个多方对话实例,涵盖24451个话语单元以及8305个问答对。区别于以往的MRC数据集,DialogueMRC数据集强调深入理解动态的对话过程,对模型应对多方对话中的复杂性及篇章解析能力提出了更高的要求。为应对中文多方对话机器阅读理解的挑战,本研究提出了融合篇章结构感知能力的中文多方对话问答模型(DiscourseStructure-aware QA Model for Chinese Multi-party Dialogue,DSQA-CMD),该模型融合了问答和篇章解析任务,以提升对话上下文的理解能力。实验结果表明,相较于典型的基于微调的预训练语言模型,DSQA-CMD模型表现出明显优势,对比基于Longformer的方法,DSQA-CMD模型在MRC任务的F1和EM评价指标上分别提升了5.4%和10.0%;与当前主流的大型语言模型相比,本模型也展现了更佳的性能,表明了本文所提出方法的有效性。”

2022

pdf bib
The CRECIL Corpus: a New Dataset for Extraction of Relations between Characters in Chinese Multi-party Dialogues
Yuru Jiang | Yang Xu | Yuhang Zhan | Weikai He | Yilin Wang | Zixuan Xi | Meiyun Wang | Xinyu Li | Yu Li | Yanchao Yu
Proceedings of the Thirteenth Language Resources and Evaluation Conference

We describe a new freely available Chinese multi-party dialogue dataset for automatic extraction of dialogue-based character relationships. The data has been extracted from the original TV scripts of a Chinese sitcom called “I Love My Home” with complex family-based human daily spoken conversations in Chinese. First, we introduced human annotation scheme for both global Character relationship map and character reference relationship. And then we generated the dialogue-based character relationship triples. The corpus annotates relationships between 140 entities in total. We also carried out a data exploration experiment by deploying a BERT-based model to extract character relationships on the CRECIL corpus and another existing relation extraction corpus (DialogRE (CITATION)).The results demonstrate that extracting character relationships is more challenging in CRECIL than in DialogRE.

2010

pdf bib
On Generalized-Topic-Based Chinese Discourse Structure
Rou Song | Yuru Jiang | Jingyi Wang
CIPS-SIGHAN Joint Conference on Chinese Language Processing