Abstract
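“Machine reading comprehension currently faces many challenges in extracting semantically complete evidence for answer options. Existing work on unsupervised evidence extraction falls mainly into two categories: one uses static word embeddings and iteratively extracts relevant sentences with beam search; the other uses instance-level supervision, covering both standalone and end-to-end evidence extraction. The former involves a cumbersome processing pipeline, while the latter is unstable during joint training, which directly makes stable performance gains hard to achieve. For CCL23-Eval Task 9, this paper proposes an adaptive end-to-end evidence extraction method based on overlapping segment generation. To address the unclear boundaries of evidence sentences, the method divides the document into multiple overlapping sentence segments and extracts the key parts as evidence, thereby capturing the overall semantics. In addition, the evidence extraction embedding module is optimized so that the confidence of evidence segments is adjusted automatically. Experimental results show that the proposed method greatly reduces interference from redundant content, stably improves the performance of the reading comprehension model with only a single hyperparameter, and strengthens the model's robustness.”
- Anthology ID: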
- 2023.ccl-3.32
- Volume:
- Proceedings of the 22nd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations)
- Month:
- August
- Year:
- 2023
- Address:
- Harbin, China
- Editors:
- Maosong Sun, Bing Qin, Xipeng Qiu, Jing Jiang, Xianpei Han
- Venue:
- CCL
- Publisher:
- Chinese Information Processing Society of China
- Pages:
- 293–302
- Language:
- Chinese
- URL:
- https://aclanthology.org/2023.ccl-3.32
- Cite (ACL):
- Suzhe He, Chongsheng Yang, and Shumin Shi. 2023. CCL23-Eval 任务9系统报告:基于重叠片段生成增强阅读理解模型鲁棒性的方法(System Report for CCL23-Eval Task 9: Improving MRC Robustness with Overlapping Segments Generation for GCRC_advRobust). In Proceedings of the 22nd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations), pages 293–302, Harbin, China. Chinese Information Processing Society of China.
- Cite (Informal):
- CCL23-Eval 任务9系统报告:基于重叠片段生成增强阅读理解模型鲁棒性的方法(System Report for CCL23-Eval Task 9: Improving MRC Robustness with Overlapping Segments Generation for GCRC_advRobust) (He et al., CCL 2023)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2023.ccl-3.32.pdf
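
The abstract describes dividing a document into multiple overlapping sentence segments before evidence is extracted. The sketch below illustrates only that splitting step as a plain sliding window over sentences. It is not the authors' implementation: the function names `split_sentences` and `overlapping_segments`, the `window` and `stride` parameters, and the punctuation-based sentence splitter are all assumptions made for illustration, and the abstract does not say whether its single hyperparameter corresponds to either of them.

```python
import re
from typing import List


def split_sentences(text: str) -> List[str]:
    """Split text after sentence-ending punctuation (hypothetical, simplistic splitter)."""
    # Zero-width split after Chinese or Western terminators; requires Python 3.7+.
    parts = re.split(r"(?<=[。！？!?])", text)
    return [p.strip() for p in parts if p.strip()]


def overlapping_segments(
    sentences: List[str], window: int = 3, stride: int = 1
) -> List[List[str]]:
    """Group sentences into windows of `window` sentences, advancing by `stride`.

    Consecutive segments share sentences whenever stride < window.
    Both parameters are illustrative assumptions, not values from the paper.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if not sentences:
        return []
    if len(sentences) <= window:
        # The document is shorter than one window: return it as a single segment.
        return [list(sentences)]

    last_start = len(sentences) - window
    segments = [
        sentences[start:start + window]
        for start in range(0, last_start + 1, stride)
    ]
    # If the stride skipped the tail, add one final segment so no sentence is dropped.
    if last_start % stride != 0:
        segments.append(sentences[last_start:])
    return segments


if __name__ == "__main__":
    doc = "今天下雨。我们在家！你呢？我在看书。"
    for segment in overlapping_segments(split_sentences(doc), window=2, stride=1):
        print("".join(segment))
```

In a pipeline of the kind the abstract outlines, each segment would then be scored against an answer option and the highest-confidence segments kept as evidence; that scoring step depends on the model and is not sketched here.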