Improving Low-resource Named Entity Recognition with Graph Propagated Data Augmentation

Jiong Cai, Shen Huang, Yong Jiang, Zeqi Tan, Pengjun Xie, Kewei Tu


Abstract
Data augmentation is an effective way to improve model performance and robustness for low-resource named entity recognition (NER). However, synthetic data often suffer from poor diversity, which limits performance. In this paper, we propose a novel Graph Propagated Data Augmentation (GPDA) framework for NER that leverages graph propagation to build relationships between labeled data and unlabeled natural texts. Projecting annotations from the labeled texts onto the unlabeled texts yields partially labeled texts, which are more diverse than synthetically annotated data. To improve propagation precision, a simple search engine built on Wikipedia is used to fetch texts related to the labeled data and to propagate entity labels to them according to anchor links. In addition, we construct and run experiments on a real-world low-resource dataset from the e-commerce domain, which will be made publicly available to facilitate low-resource NER research. Experimental results show that GPDA yields substantial improvements over previous data augmentation methods on multiple low-resource NER datasets.
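The label projection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `propagate_labels`, the exact-match rule on lowercased surface forms, the BIO tag scheme, and the `?` placeholder for unlabeled tokens are all assumptions made for illustration. It only shows how anchor-linked spans in a retrieved text might receive entity types from labeled data, leaving the remaining tokens unannotated.

```python
from typing import Dict, List, Tuple

UNKNOWN = "?"  # assumed placeholder: token has no propagated label (partial annotation)


def propagate_labels(
    tokens: List[str],
    anchors: List[Tuple[int, int]],
    entity_types: Dict[str, str],
) -> List[str]:
    """Project entity types from labeled data onto anchor spans of a retrieved text.

    tokens       -- tokens of an unlabeled (retrieved) sentence
    anchors      -- half-open token spans (start, end) covered by anchor links
    entity_types -- lowercased entity surface form -> type, taken from labeled data
    """
    labels = [UNKNOWN] * len(tokens)
    for start, end in anchors:
        surface = " ".join(tokens[start:end]).lower()
        etype = entity_types.get(surface)
        if etype is None:
            continue  # anchor text never seen in labeled data: leave it unlabeled
        labels[start] = f"B-{etype}"
        for i in range(start + 1, end):
            labels[i] = f"I-{etype}"
    return labels


# Hypothetical example: only the anchor matching a known entity gets a label.
tokens = ["The", "conference", "was", "held", "in", "Toronto", ",", "Canada"]
anchors = [(5, 6), (7, 8)]
known = {"toronto": "LOC"}
print(propagate_labels(tokens, anchors, known))
```

Tokens left as `?` are the reason the resulting data is only partially labeled; a downstream tagger would need a training objective that tolerates unknown labels.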
Anthology ID:
2023.acl-short.11
Volume:
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
110–118
URL:
https://aclanthology.org/2023.acl-short.11
DOI:
10.18653/v1/2023.acl-short.11
Cite (ACL):
Jiong Cai, Shen Huang, Yong Jiang, Zeqi Tan, Pengjun Xie, and Kewei Tu. 2023. Improving Low-resource Named Entity Recognition with Graph Propagated Data Augmentation. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 110–118, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Improving Low-resource Named Entity Recognition with Graph Propagated Data Augmentation (Cai et al., ACL 2023)
PDF:
https://preview.aclanthology.org/improve-issue-templates/2023.acl-short.11.pdf
Video:
https://preview.aclanthology.org/improve-issue-templates/2023.acl-short.11.mp4