基于预训练语言模型的案件要素识别方法(A Method for Case Factor Recognition Based on Pre-trained Language Models)
Haishun Liu (刘海顺), Lei Wang (王雷), Yanguang Chen (陈彦光), Shuchen Zhang (张书晨), Yuanyuan Sun (孙媛媛), Hongfei Lin (林鸿飞)
Abstract
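Case factor recognition automatically extracts the important factual statements from a case description and classifies them according to a factor system designed by domain experts. It is an important research topic in the field of intelligent justice. Text encoders based on traditional neural networks struggle to extract deep features, and threshold-based multi-label classification struggles to capture dependencies between labels, so this paper proposes a multi-label text classification model based on a pre-trained language model. The model uses a language model whose features are fused with a Layer-attentive strategy as the encoder, and an LSTM-based sequence generation model as the decoder. In experiments on the "CAIL2019" dataset, the method improves F1 by up to 7.6% over algorithms based on recurrent neural networks, and by about 3.2% over the base language model (BERT) under the same hyperparameter settings.
- Anthology ID: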
- 2020.ccl-1.69
- Volume:
- Proceedings of the 19th Chinese National Conference on Computational Linguistics
- Month:
- October
- Year:
- 2020
- Address:
- Haikou, China
- Editors:
- Maosong Sun (孙茂松), Sujian Li (李素建), Yue Zhang (张岳), Yang Liu (刘洋)
- Venue:
- CCL
- Publisher:
- Chinese Information Processing Society of China
- Pages:
- 743–753
- Language:
- Chinese
- URL:
- https://preview.aclanthology.org/icon-24-ingestion/2020.ccl-1.69/
- Cite (ACL):
- Haishun Liu, Lei Wang, Yanguang Chen, Shuchen Zhang, Yuanyuan Sun, and Hongfei Lin. 2020. 基于预训练语言模型的案件要素识别方法(A Method for Case Factor Recognition Based on Pre-trained Language Models). In Proceedings of the 19th Chinese National Conference on Computational Linguistics, pages 743–753, Haikou, China. Chinese Information Processing Society of China.
- Cite (Informal):
- 基于预训练语言模型的案件要素识别方法(A Method for Case Factor Recognition Based on Pre-trained Language Models) (Liu et al., CCL 2020)
- PDF:
- https://preview.aclanthology.org/icon-24-ingestion/2020.ccl-1.69.pdf
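
The abstract describes the encoder as a language model whose per-layer features are fused with a Layer-attentive strategy, but it does not give the formula. The following is a minimal sketch of one common way to do this kind of fusion, assuming a softmax-weighted sum over layer outputs (a scalar mix); the function names, the `layer_scores` parameters, and the `gamma` scale are illustrative assumptions, not the paper's confirmed implementation.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scalars."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def layer_attentive_fusion(
    layer_outputs: List[List[float]],
    layer_scores: List[float],
    gamma: float = 1.0,
) -> List[float]:
    """Fuse one token's hidden vectors from several encoder layers.

    layer_outputs: one hidden vector per layer, all of the same size.
    layer_scores:  one scalar per layer (hypothetical; these would be
                   learned in a real model), normalized with softmax.
    gamma:         hypothetical global scale applied to the fused vector.
    """
    if len(layer_outputs) != len(layer_scores):
        raise ValueError("need exactly one score per layer")
    weights = softmax(layer_scores)
    dim = len(layer_outputs[0])
    fused = [0.0] * dim
    for w, vec in zip(weights, layer_outputs):
        if len(vec) != dim:
            raise ValueError("all layer outputs must have the same size")
        for i, v in enumerate(vec):
            fused[i] += w * v
    return [gamma * x for x in fused]
```

With equal scores the fusion reduces to a plain average of the layers; training the scores would let the model emphasize whichever layers are most useful for the classification task. In the architecture the abstract outlines, a fused representation of this kind would then be passed to the LSTM-based decoder that generates the label sequence.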