De-Bias for Generative Extraction in Unified NER Task

Shuai Zhang, Yongliang Shen, Zeqi Tan, Yiquan Wu, Weiming Lu


Abstract
Named entity recognition (NER) is a fundamental task that recognizes specific types of entities in a given sentence. Depending on how the entities appear in the sentence, it can be divided into three subtasks: Flat NER, Nested NER, and Discontinuous NER. Among the existing approaches, only the generative model can be uniformly adapted to these three subtasks. However, when the generative model is applied to NER, its optimization objective is not consistent with the task, which makes the model vulnerable to incorrect biases. In this paper, we analyze the incorrect biases in the generation process from a causality perspective and attribute them to two confounders: the pre-context confounder and the entity-order confounder. Furthermore, we design Intra- and Inter-entity Deconfounding Data Augmentation methods to eliminate these confounders according to the theory of backdoor adjustment. Experiments show that our method improves the performance of the generative NER model on various datasets.
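For reference, the backdoor adjustment mentioned in the abstract has the standard form below. The symbols are assumptions chosen for illustration and are not taken from the paper: X is the model input, Y is the generated output, and C is a confounder (for example, the pre-context or the entity order) whose values c satisfy the backdoor criterion.

```latex
% Standard backdoor adjustment; X, Y, C are illustrative symbols, not the paper's notation.
P\bigl(Y \mid do(X)\bigr) \;=\; \sum_{c} P\bigl(Y \mid X,\, C = c\bigr)\, P\bigl(C = c\bigr)
```

Under this reading, the prediction is averaged over confounder values weighted by their marginal probability P(C = c) rather than by how often they co-occur with X. Data augmentation that varies the confounder independently of the input can be seen as one way to approximate this sum.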
Anthology ID:
2022.acl-long.59
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
808–818
URL:
https://aclanthology.org/2022.acl-long.59
DOI:
10.18653/v1/2022.acl-long.59
Cite (ACL):
Shuai Zhang, Yongliang Shen, Zeqi Tan, Yiquan Wu, and Weiming Lu. 2022. De-Bias for Generative Extraction in Unified NER Task. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 808–818, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
De-Bias for Generative Extraction in Unified NER Task (Zhang et al., ACL 2022)
PDF:
https://preview.aclanthology.org/remove-xml-comments/2022.acl-long.59.pdf
Data
GENIA