专业技术文本关键词抽取方法(Keyword Extraction on Professional Technical Text)

Xiangdong Ning (宁祥东), Bin Gong (龚斌), Lin Wan (万林), Yuqing Sun (孙宇清)


Abstract
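Relevance and specificity are crucial for keyword extraction from professional technical text. Targeting the code retrieval task, this paper proposes a keyword extraction model for professional technical text that combines semantic information, sequential relations, and syntactic structure. The pretrained language model BERT is used to extract abstract semantic information from the text. A semantic association graph is built by jointly analyzing sequential relations and syntactic structure, in order to capture long-distance semantic dependencies between words. Keyword weights are then computed with a random walk algorithm and lexical knowledge, so that both relevance and specificity are taken into account. The model is compared with other models on two datasets, and the results show that the keywords it extracts have better relevance and specificity.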
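As a rough illustration of the graph-and-random-walk idea described in the abstract, the sketch below builds a word graph from sequential adjacency plus dependency edges and ranks words with a personalized random walk. It is not the authors' implementation: the BERT component is omitted, and the function names, the `window` and `damping` parameters, the toy dependency edges, and the `specificity_prior` dictionary (a stand-in for the paper's lexical knowledge) are all assumptions made for this example.

```python
from collections import defaultdict


def build_graph(tokens, dependency_edges, window=2):
    """Build an undirected weighted word graph (hypothetical helper).

    Edges come from two sources: tokens that co-occur within `window`
    positions (sequential relation) and head/dependent index pairs
    supplied by the caller (syntactic structure).
    """
    graph = defaultdict(lambda: defaultdict(float))

    def link(u, v):
        if u != v:
            graph[u][v] += 1.0
            graph[v][u] += 1.0

    for i, u in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            link(u, tokens[j])
    for head, dep in dependency_edges:
        link(tokens[head], tokens[dep])
    return {u: dict(nbrs) for u, nbrs in graph.items()}


def random_walk_scores(graph, prior, damping=0.85, iterations=50):
    """Personalized random walk (PageRank-style power iteration).

    `prior` biases the restart distribution toward words assumed to be
    more specific; words missing from `prior` get weight 1.0.
    """
    nodes = list(graph)
    total = sum(prior.get(n, 1.0) for n in nodes)
    restart = {n: prior.get(n, 1.0) / total for n in nodes}
    out_weight = {n: sum(graph[n].values()) for n in nodes}
    scores = dict(restart)
    for _ in range(iterations):
        updated = {}
        for n in nodes:
            incoming = sum(
                scores[m] * graph[m][n] / out_weight[m] for m in graph[n]
            )
            updated[n] = (1 - damping) * restart[n] + damping * incoming
        scores = updated
    return scores


# Toy input: a tokenized description of a code snippet.
tokens = ["parse", "json", "file", "and", "return", "json", "object"]
# Assumed dependency edges as (head index, dependent index) pairs.
dependency_edges = [(0, 2), (4, 6)]
# Assumed specificity weights standing in for lexical knowledge.
specificity_prior = {"json": 3.0, "and": 0.1}

graph = build_graph(tokens, dependency_edges)
scores = random_walk_scores(graph, specificity_prior)
ranked = sorted(scores, key=scores.get, reverse=True)
```

Here `ranked` lists the words from highest to lowest weight. Biasing the restart distribution is only one way to inject specificity; the paper may combine the two signals differently.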
Anthology ID:
2022.ccl-1.14
Volume:
Proceedings of the 21st Chinese National Conference on Computational Linguistics
Month:
October
Year:
2022
Address:
Nanchang, China
Editors:
Maosong Sun (孙茂松), Yang Liu (刘洋), Wanxiang Che (车万翔), Yang Feng (冯洋), Xipeng Qiu (邱锡鹏), Gaoqi Rao (饶高琦), Yubo Chen (陈玉博)
Venue:
CCL
Publisher:
Chinese Information Processing Society of China
Pages:
143–154
Language:
Chinese
URL:
https://aclanthology.org/2022.ccl-1.14
Cite (ACL):
Xiangdong Ning, Bin Gong, Lin Wan, and Yuqing Sun. 2022. 专业技术文本关键词抽取方法(Keyword Extraction on Professional Technical Text). In Proceedings of the 21st Chinese National Conference on Computational Linguistics, pages 143–154, Nanchang, China. Chinese Information Processing Society of China.
Cite (Informal):
专业技术文本关键词抽取方法(Keyword Extraction on Professional Technical Text) (Ning et al., CCL 2022)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2022.ccl-1.14.pdf