Boran Hao




2020

Enhancing Clinical BERT Embedding using a Biomedical Knowledge Base
Boran Hao | Henghui Zhu | Ioannis Paschalidis
Proceedings of the 28th International Conference on Computational Linguistics

Domain knowledge is important for building Natural Language Processing (NLP) systems for low-resource settings, such as the clinical domain. In this paper, a novel joint training method is introduced for adding knowledge base information from the Unified Medical Language System (UMLS) into language model pre-training on a clinical domain corpus. We show that in three different downstream clinical NLP tasks, our pre-trained language model outperforms the corresponding model with no knowledge base information and other state-of-the-art models. Specifically, in a natural language inference task applied to clinical texts, our knowledge base pre-training approach improves accuracy by up to 1.7%, whereas in clinical named entity recognition tasks, the F1-score improves by up to 1.0%. The pre-trained models are available at https://github.com/noc-lab/clinical-kb-bert.