Mixture-of-Partitions: Infusing Large Biomedical Knowledge Graphs into BERT
Zaiqiao Meng, Fangyu Liu, Thomas Clark, Ehsan Shareghi, Nigel Collier
Abstract
Infusing factual knowledge into pre-trained models is fundamental for many knowledge-intensive tasks. In this paper, we propose Mixture-of-Partitions (MoP), an infusion approach that can handle a very large knowledge graph (KG) by partitioning it into smaller sub-graphs and infusing their specific knowledge into various BERT models using lightweight adapters. To leverage the overall factual knowledge for a target task, these sub-graph adapters are further fine-tuned along with the underlying BERT through a mixture layer. We evaluate MoP with three biomedical BERTs (SciBERT, BioBERT, PubMedBERT) on six downstream tasks (including NLI, QA, and classification), and the results show that MoP consistently enhances the underlying BERTs in task performance and achieves new SOTA performance on five evaluated datasets.
- Anthology ID:
- 2021.emnlp-main.383
- Volume:
- Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
- Month:
- November
- Year:
- 2021
- Address:
- Online and Punta Cana, Dominican Republic
- Editors:
- Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 4672–4681
- URL:
- https://aclanthology.org/2021.emnlp-main.383
- DOI:
- 10.18653/v1/2021.emnlp-main.383
- Cite (ACL):
- Zaiqiao Meng, Fangyu Liu, Thomas Clark, Ehsan Shareghi, and Nigel Collier. 2021. Mixture-of-Partitions: Infusing Large Biomedical Knowledge Graphs into BERT. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 4672–4681, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
- Cite (Informal):
- Mixture-of-Partitions: Infusing Large Biomedical Knowledge Graphs into BERT (Meng et al., EMNLP 2021)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-1/2021.emnlp-main.383.pdf
- Code:
- cambridgeltl/mop
- Data:
- MedQA, PubMedQA
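
The abstract describes a mixture layer that combines the outputs of several sub-graph adapters for a target task. The following is a minimal sketch of that idea only, not the authors' implementation: it assumes the mixture is a softmax-weighted sum of adapter outputs, with one score per adapter computed as a dot product against the base model's hidden vector. The names `mix_adapters`, `base_hidden`, `adapter_hiddens`, and `gate_vectors` are hypothetical and chosen for illustration; the actual code is in the cambridgeltl/mop repository.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix_adapters(base_hidden, adapter_hiddens, gate_vectors):
    """Combine K adapter outputs into one vector (illustrative assumption).

    base_hidden:     hidden vector from the underlying BERT, length d
    adapter_hiddens: K vectors of length d, one per sub-graph adapter
    gate_vectors:    K hypothetical learnable vectors of length d used for scoring
    """
    # One scalar score per adapter: dot product of the base vector and its gate.
    scores = [
        sum(b * g for b, g in zip(base_hidden, gate))
        for gate in gate_vectors
    ]
    weights = softmax(scores)
    # Weighted sum of the adapter outputs, dimension by dimension.
    dim = len(base_hidden)
    mixed = [
        sum(w * hidden[i] for w, hidden in zip(weights, adapter_hiddens))
        for i in range(dim)
    ]
    return mixed, weights


# Two adapters with all-zero gates receive equal weight.
mixed, weights = mix_adapters(
    base_hidden=[1.0, 0.0],
    adapter_hiddens=[[1.0, 0.0], [0.0, 1.0]],
    gate_vectors=[[0.0, 0.0], [0.0, 0.0]],
)
```

In this sketch the gate vectors stand in for parameters that would be learned during fine-tuning, so that adapters whose sub-graph knowledge is more relevant to the input receive a larger weight.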