Yuru Jiang
Also published as:
玉茹 蒋
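In machine reading comprehension (MRC), processing and analyzing multi-party dialogue has long been a challenging research task. Given the scarcity of relevant data resources for Chinese, this study constructs the DialogueMRC dataset to advance research in this area. As the first MRC dataset for Chinese multi-party dialogue, DialogueMRC contains 705 multi-party dialogue instances, covering 24,451 utterance units and 8,305 question-answer pairs. Unlike previous MRC datasets, DialogueMRC emphasizes deep understanding of the dynamic dialogue process, placing higher demands on a model's ability to handle the complexity of multi-party dialogue and to parse discourse. To address the challenges of Chinese multi-party dialogue MRC, this study proposes a Discourse Structure-aware QA Model for Chinese Multi-party Dialogue (DSQA-CMD), which combines the question answering and discourse parsing tasks to improve understanding of the dialogue context. Experimental results show that DSQA-CMD has a clear advantage over typical fine-tuned pre-trained language models: compared with a Longformer-based method, it improves the F1 and EM metrics on the MRC task by 5.4% and 10.0%, respectively. It also outperforms current mainstream large language models, demonstrating the effectiveness of the proposed method.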
We describe a new, freely available Chinese multi-party dialogue dataset for automatic extraction of dialogue-based character relationships. The data was extracted from the original TV scripts of a Chinese sitcom called “I Love My Home”, which features complex, family-centered everyday spoken conversations in Chinese. First, we introduced a human annotation scheme for both the global character relationship map and character reference relationships. We then generated the dialogue-based character relationship triples. The corpus annotates relationships between 140 entities in total. We also carried out a data exploration experiment by deploying a BERT-based model to extract character relationships on the CRECIL corpus and on an existing relation extraction corpus, DialogRE (CITATION). The results demonstrate that extracting character relationships is more challenging in CRECIL than in DialogRE.