Libang Wang

Also published as: 王立帮

2024

NNP-TDGM: 基于最近邻提示表征的术语DEF生成模型(NNP-TDGM: Nearest Neighbor Prompt Term DEF Generation Model)
Sijia Shen (沈思嘉) | Peiyan Wang (王裴岩) | Shengren Wang (王胜任) | Libang Wang (王立帮)
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)

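This paper studies the automatic generation of term DEFs under the grammar of HowNet's knowledge base description language. It proposes a term DEF generation model based on nearest neighbor prompt representations (NNP-TDGM). The model builds an explicit memory set from the term DEFs in the training set. When the decoder generates a (head) sememe or a relation, it retrieves the core concepts, important attributes, and relation types carried by terms whose conceptual structure is the same as or similar to that of the term being predicted. This retrieval assists DEF generation and addresses the decoder's insufficient training on low-frequency samples. In addition, the model prompts a pre-trained language model to obtain semantic representation vectors of the conceptual information contained in terms and their definitions, which mitigates the encoder's limited representational capacity. Experiments show that the term DEFs generated by NNP-TDGM reach an F1 of 31.84% on sememe-relation-sememe triples, 53.12% on relations, 51.55% on sememes, and 68.53% on head sememes, improvements of 3.38%, 1.45%, 1.08%, and 0.48%, respectively, over the baseline method.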

面向中文实体识别的Transformers模型句子级非对抗鲁棒性研究(On Sentence-level Non-adversarial Robustness of Chinese Named Entity Recognition with Transformers Model)
Libang Wang (王立帮) | Peiyan Wang (王裴岩) | Sijia Shen (沈思嘉)
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)

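Transformer-based Chinese named entity recognition models have achieved excellent performance on standard entity recognition benchmarks, and their robustness has attracted wide attention. However, the sentence-level non-adversarial robustness problems that these models face in real-world deployment remain understudied, and this paper investigates them. First, the paper analyzes theoretically and identifies the negative effects of self-attention, relative position embeddings, and absolute position embeddings in the Transformer on model robustness. It then proposes two robustness enhancement methods, entity label augmentation and a sliding window constraint, and proves theoretically that they can improve the entity recognition robustness of Transformer-based models. Finally, experiments on 3 Chinese datasets examine the vulnerability of 4 Transformer-based entity recognition models; the proposed methods improve the models' robustness F1 by up to 4.95%.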

A Corpus and Method for Chinese Named Entity Recognition in Manufacturing
Ruiting Li | Peiyan Wang | Libang Wang | Danqingxin Yang | Dongfeng Cai
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Manufacturing specifications are documents that describe the techniques, processes, and components involved in manufacturing. With the development of smart manufacturing, there is a growing demand for named entity recognition (NER) resources and techniques for manufacturing-specific named entities. In this paper, we introduce a corpus of Chinese manufacturing specifications, named MS-NERC, comprising 4,424 sentences and 16,383 entities. We also propose an entity recognizer named Trainable State Transducer (TST), which is initialized with a finite state transducer describing the morphological patterns of entities. It can directly recognize entities based on prior morphological knowledge without training. Experimental results show that TST achieves an overall F1 score of 82.05% for morphological-specific entities in the zero-shot setting. TST can be further improved through training, after which it outperforms neural methods in both few-shot and rich-resource settings. We believe that our corpus and model will be valuable resources for NER research not only in manufacturing but also in other low-resource domains.

Co-authors

Danqingxin Yang 1

