基于相似度进行句子选择的机器阅读理解数据增强(Machine reading comprehension data Augmentation for sentence selection based on similarity)

Shuang Nie (聂双), Zheng Ye (叶正), Jun Qin (覃俊), Jing Liu (刘晶)


Abstract
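“Common data augmentation methods for machine reading comprehension, such as back-translation, augment the passage or the question in isolation and ignore the relationships within the (passage, question, option) triple. This paper therefore explores a data augmentation method that uses those relationships to filter passage sentences: it compares the similarity of the passage with the question and with the options, and selects the passage sentences most closely related to both. In addition, to increase the differences between the triples of different options, we adopt a regularized Dropout strategy. Experimental results show that accuracy on the RACE dataset can be improved by 3.8%.”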
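The sentence-selection idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which similarity measure the authors use, so this example assumes bag-of-words cosine similarity. The function names (`tokenize`, `cosine`, `select_sentences`), the `top_k` parameter, and the choice to add the question and option scores together are all hypothetical, chosen only for illustration; this is not the paper's implementation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_sentences(passage, question, option, top_k=2):
    """Keep the top_k passage sentences most similar to the question
    and the option, returned in their original passage order."""
    # Naive sentence splitting on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", passage) if s.strip()]
    q_vec = Counter(tokenize(question))
    o_vec = Counter(tokenize(option))
    scored = []
    for idx, sent in enumerate(sentences):
        s_vec = Counter(tokenize(sent))
        # Assumption: the two similarities are simply added together.
        scored.append((cosine(s_vec, q_vec) + cosine(s_vec, o_vec), idx))
    # Highest score first; ties broken by earlier position in the passage.
    best = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:top_k]
    return [sentences[i] for i in sorted(idx for _, idx in best)]


passage = (
    "Tom lives in a small town. He walks to school every morning. "
    "His sister likes painting flowers. The weather was cold last winter."
)
question = "How does Tom go to school?"
option = "He walks to school."
print(select_sentences(passage, question, option, top_k=2))
```

Running the selection once per option produces one shortened passage for each (passage, question, option) triple, which is the kind of option-specific input the abstract describes. A real system would likely replace the bag-of-words vectors with learned sentence embeddings.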
Anthology ID:
2022.ccl-1.51
Volume:
Proceedings of the 21st Chinese National Conference on Computational Linguistics
Month:
October
Year:
2022
Address:
Nanchang, China
Editors:
Maosong Sun (孙茂松), Yang Liu (刘洋), Wanxiang Che (车万翔), Yang Feng (冯洋), Xipeng Qiu (邱锡鹏), Gaoqi Rao (饶高琦), Yubo Chen (陈玉博)
Venue:
CCL
Publisher:
Chinese Information Processing Society of China
Pages:
569–579
Language:
Chinese
URL:
https://preview.aclanthology.org/icon-24-ingestion/2022.ccl-1.51/
Cite (ACL):
Shuang Nie, Zheng Ye, Jun Qin, and Jing Liu. 2022. 基于相似度进行句子选择的机器阅读理解数据增强(Machine reading comprehension data Augmentation for sentence selection based on similarity). In Proceedings of the 21st Chinese National Conference on Computational Linguistics, pages 569–579, Nanchang, China. Chinese Information Processing Society of China.
Cite (Informal):
基于相似度进行句子选择的机器阅读理解数据增强(Machine reading comprehension data Augmentation for sentence selection based on similarity) (Nie et al., CCL 2022)
PDF:
https://preview.aclanthology.org/icon-24-ingestion/2022.ccl-1.51.pdf
Data
RACE