HyperNetwork-based Decoupling to Improve Model Generalization for Few-Shot Relation Extraction

Liang Zhang, Chulun Zhou, Fandong Meng, Jinsong Su, Yidong Chen, Jie Zhou


Abstract
Few-shot relation extraction (FSRE) aims to train a model that can handle new relations using only a few labeled examples. Most existing studies employ Prototypical Networks for FSRE, which usually overfit the relation classes in the training set and cannot generalize well to unseen relations. By investigating the class separation of an FSRE model, we find that the upper layers of the model are prone to learning relation-specific knowledge. Therefore, in this paper, we propose a HyperNetwork-based Decoupling approach to improve the generalization of FSRE models. Specifically, our model consists of an encoder, a network generator (for producing relation classifiers), and the produced-then-finetuned classifiers for every N-way-K-shot episode. We also design a two-step training framework along with a class-agnostic aligner, in which the generated classifiers focus on acquiring relation-specific knowledge and the encoder is encouraged to learn more general relation knowledge. In this way, the roles of the upper and lower layers in an FSRE model are explicitly decoupled, which enhances its generalization capability during testing. Experiments on two public datasets demonstrate the effectiveness of our method.
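The episode structure mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the toy feature vectors stand in for encoder outputs, the `generator` matrix is a hypothetical linear stand-in for the network generator, and the fine-tuning of the produced classifiers, the two-step training framework, and the class-agnostic aligner are all omitted. All function names and relation labels are invented for illustration.

```python
import random


def sample_episode(data, n_way, k_shot, n_query, rng):
    """Draw one N-way-K-shot episode from {relation: [feature vectors]}."""
    relations = rng.sample(sorted(data), n_way)
    support, query = {}, []
    for rel in relations:
        picked = rng.sample(data[rel], k_shot + n_query)
        support[rel] = picked[:k_shot]
        query.extend((vec, rel) for vec in picked[k_shot:])
    return support, query


def mean_vector(vectors):
    """Element-wise mean of equally sized vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def generate_classifier(support, generator):
    """Hypothetical generator: a linear map applied to each relation's mean
    support vector, yielding one weight vector per relation in the episode."""
    weights = {}
    for rel, vecs in support.items():
        proto = mean_vector(vecs)
        weights[rel] = [sum(g * p for g, p in zip(row, proto)) for row in generator]
    return weights


def classify(vec, weights):
    """Pick the relation whose generated weight vector scores highest."""
    scores = {rel: sum(w * x for w, x in zip(ws, vec)) for rel, ws in weights.items()}
    return max(scores, key=scores.get)


# Toy "encoder outputs": relation r is mostly axis r, plus a small perturbation.
DIM = 4
RELATIONS = ["rel_a", "rel_b", "rel_c", "rel_d"]  # placeholder relation names
data = {}
for r, name in enumerate(RELATIONS):
    vectors = []
    for j in range(5):
        vec = [0.0] * DIM
        vec[r] = 1.0
        vec[(r + 1) % DIM] = 0.05 * j
        vectors.append(vec)
    data[name] = vectors

# Identity matrix as the simplest possible generator (an assumption).
generator = [[1.0 if i == j else 0.0 for j in range(DIM)] for i in range(DIM)]

rng = random.Random(0)
support, query = sample_episode(data, n_way=3, k_shot=2, n_query=1, rng=rng)
weights = generate_classifier(support, generator)
correct = sum(classify(vec, weights) == rel for vec, rel in query)
accuracy = correct / len(query)
print(accuracy)
```

With an identity generator the sketch reduces to scoring each query against class prototypes by dot product, so it only shows where a learned generator would sit between the encoder and the per-episode classifiers.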
Anthology ID:
2023.emnlp-main.381
Volume:
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
6213–6223
URL:
https://aclanthology.org/2023.emnlp-main.381
DOI:
10.18653/v1/2023.emnlp-main.381
Cite (ACL):
Liang Zhang, Chulun Zhou, Fandong Meng, Jinsong Su, Yidong Chen, and Jie Zhou. 2023. HyperNetwork-based Decoupling to Improve Model Generalization for Few-Shot Relation Extraction. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 6213–6223, Singapore. Association for Computational Linguistics.
Cite (Informal):
HyperNetwork-based Decoupling to Improve Model Generalization for Few-Shot Relation Extraction (Zhang et al., EMNLP 2023)
PDF:
https://preview.aclanthology.org/dois-2013-emnlp/2023.emnlp-main.381.pdf
Video:
https://preview.aclanthology.org/dois-2013-emnlp/2023.emnlp-main.381.mp4