Integrating Higher-Level Semantics into Robust Biomedical Name Representations

Pieter Fivez, Simon Suster, Walter Daelemans


Abstract
Neural encoders of biomedical names are typically considered robust if representations can be effectively exploited for various downstream NLP tasks. To achieve this, encoders need to model domain-specific biomedical semantics while rivaling the universal applicability of pretrained self-supervised representations. Previous work on robust representations has focused on learning low-level distinctions between names of fine-grained biomedical concepts. These fine-grained concepts can also be clustered together to reflect higher-level, more general semantic distinctions, such as grouping the names "nettle sting" and "tick-borne fever" together under the description "puncture wound of skin". It has not yet been empirically confirmed that training biomedical name encoders on fine-grained distinctions automatically leads to bottom-up encoding of such higher-level semantics. In this paper, we show that this bottom-up effect exists, but that it is still relatively limited. As a solution, we propose a scalable multi-task training regime for biomedical name encoders which can also learn robust representations using only higher-level semantic classes. These representations can generalise both bottom-up and top-down among various semantic hierarchies. Moreover, we show how they can be used out-of-the-box for improved unsupervised detection of hypernyms, while retaining robust performance on various semantic relatedness benchmarks.
Anthology ID:
2021.louhi-1.6
Volume:
Proceedings of the 12th International Workshop on Health Text Mining and Information Analysis
Month:
April
Year:
2021
Address:
online
Editors:
Eben Holderness, Antonio Jimeno Yepes, Alberto Lavelli, Anne-Lyse Minard, James Pustejovsky, Fabio Rinaldi
Venue:
Louhi
Publisher:
Association for Computational Linguistics
Pages:
49–58
URL:
https://preview.aclanthology.org/icon-24-ingestion/2021.louhi-1.6/
Cite (ACL):
Pieter Fivez, Simon Suster, and Walter Daelemans. 2021. Integrating Higher-Level Semantics into Robust Biomedical Name Representations. In Proceedings of the 12th International Workshop on Health Text Mining and Information Analysis, pages 49–58, online. Association for Computational Linguistics.
Cite (Informal):
Integrating Higher-Level Semantics into Robust Biomedical Name Representations (Fivez et al., Louhi 2021)
PDF:
https://preview.aclanthology.org/icon-24-ingestion/2021.louhi-1.6.pdf
Code:
clips/higherlevelsemantics
Data:
IS-A, SemEval-2018 Task-9