Abstract
Bootstrapping for entity set expansion (ESE), which expands an entity set using only a few seed entities as supervision, has been studied for a long time. Recent end-to-end bootstrapping approaches have shown advantages in capturing information and modeling the bootstrapping process. However, due to the sparse-supervision problem, previous end-to-end methods often leverage only information from nearby neighborhoods (local semantics) rather than information propagated through the co-occurrence structure of the whole corpus (global semantics). To address this issue, this paper proposes the Global Bootstrapping Network (GBN), trained with a “pre-training and fine-tuning” strategy for effective learning. Specifically, it contains a global-sighted encoder that captures and encodes both local and global semantics into entity embeddings, and an attention-guided decoder that sequentially expands new entities based on these embeddings. The experimental results show that a GBN trained with the “pre-training and fine-tuning” strategy achieves state-of-the-art performance on two bootstrapping datasets.
- Anthology ID:
- 2020.findings-emnlp.331
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2020
- Month:
- November
- Year:
- 2020
- Address:
- Online
- Editors:
- Trevor Cohn, Yulan He, Yang Liu
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 3705–3714
- URL:
- https://aclanthology.org/2020.findings-emnlp.331
- DOI:
- 10.18653/v1/2020.findings-emnlp.331
- Cite (ACL):
- Lingyong Yan, Xianpei Han, Ben He, and Le Sun. 2020. Global Bootstrapping Neural Network for Entity Set Expansion. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 3705–3714, Online. Association for Computational Linguistics.
- Cite (Informal):
- Global Bootstrapping Neural Network for Entity Set Expansion (Yan et al., Findings 2020)
- PDF:
- https://preview.aclanthology.org/naacl24-info/2020.findings-emnlp.331.pdf
- Code:
- lingyongyan/bootstrapping_pre-train
- Data:
- DocRED, OntoNotes 5.0
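The encoder/decoder idea described in the abstract can be sketched as a toy: entity embeddings are smoothed over a co-occurrence graph so each vector also reflects global corpus structure, and expansion then greedily adds the candidate that scores highest against the current set. This is an illustrative sketch only, not the authors' implementation; all entity names, vectors, and the graph below are made up.

```python
# Toy sketch of graph-propagated ("global") embeddings plus greedy
# sequential expansion. Not the GBN model itself.

def propagate(emb, graph, alpha=0.5, steps=2):
    """Mix each entity's vector with the mean of its co-occurring
    neighbours' vectors, repeated for a few steps."""
    for _ in range(steps):
        new = {}
        for e, v in emb.items():
            nbrs = graph.get(e, [])
            if nbrs:
                mean = [sum(emb[n][i] for n in nbrs) / len(nbrs)
                        for i in range(len(v))]
                new[e] = [(1 - alpha) * a + alpha * b
                          for a, b in zip(v, mean)]
            else:
                new[e] = list(v)
        emb = new
    return emb

def expand(seeds, emb, k=2):
    """Greedily add, k times, the candidate whose embedding has the
    highest dot product with the centroid of the expanded set."""
    expanded = list(seeds)
    dim = len(next(iter(emb.values())))
    for _ in range(k):
        centroid = [sum(emb[e][i] for e in expanded) / len(expanded)
                    for i in range(dim)]
        candidates = [e for e in emb if e not in expanded]
        best = max(candidates,
                   key=lambda e: sum(a * b
                                     for a, b in zip(emb[e], centroid)))
        expanded.append(best)
    return expanded

# Made-up example: three city-like entities and one distractor.
emb = {"london": [1.0, 0.1], "paris": [0.9, 0.2],
       "berlin": [0.8, 0.1], "apple": [0.1, 1.0]}
graph = {"london": ["paris"], "paris": ["london", "berlin"],
         "berlin": ["paris"], "apple": []}
emb = propagate(emb, graph)
print(expand(["london"], emb, k=2))  # → ['london', 'paris', 'berlin']
```

In the toy, propagation is what lets "berlin" benefit from its co-occurrence path to the seed even though it is not the seed's nearest raw neighbor; the paper's encoder plays this role over the whole corpus, and its attention-guided decoder replaces the greedy centroid scoring.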