Abstract
Neural encoders of biomedical names are typically considered robust if representations can be effectively exploited for various downstream NLP tasks. To achieve this, encoders need to model domain-specific biomedical semantics while rivaling the universal applicability of pretrained self-supervised representations. Previous work on robust representations has focused on learning low-level distinctions between names of fine-grained biomedical concepts. These fine-grained concepts can also be clustered together to reflect higher-level, more general semantic distinctions, such as grouping the names nettle sting and tick-borne fever together under the description puncture wound of skin. It has not yet been empirically confirmed that training biomedical name encoders on fine-grained distinctions automatically leads to bottom-up encoding of such higher-level semantics. In this paper, we show that this bottom-up effect exists, but that it is still relatively limited. As a solution, we propose a scalable multi-task training regime for biomedical name encoders which can also learn robust representations using only higher-level semantic classes. These representations can generalise both bottom-up as well as top-down among various semantic hierarchies. Moreover, we show how they can be used out-of-the-box for improved unsupervised detection of hypernyms, while retaining robust performance on various semantic relatedness benchmarks.
- Anthology ID:
- 2021.louhi-1.6
- Volume:
- Proceedings of the 12th International Workshop on Health Text Mining and Information Analysis
- Month:
- April
- Year:
- 2021
- Address:
- online
- Editors:
- Eben Holderness, Antonio Jimeno Yepes, Alberto Lavelli, Anne-Lyse Minard, James Pustejovsky, Fabio Rinaldi
- Venue:
- Louhi
- Publisher:
- Association for Computational Linguistics
- Pages:
- 49–58
- URL:
- https://preview.aclanthology.org/icon-24-ingestion/2021.louhi-1.6/
- Cite (ACL):
- Pieter Fivez, Simon Šuster, and Walter Daelemans. 2021. Integrating Higher-Level Semantics into Robust Biomedical Name Representations. In Proceedings of the 12th International Workshop on Health Text Mining and Information Analysis, pages 49–58, online. Association for Computational Linguistics.
- Cite (Informal):
- Integrating Higher-Level Semantics into Robust Biomedical Name Representations (Fivez et al., Louhi 2021)
- PDF:
- https://preview.aclanthology.org/icon-24-ingestion/2021.louhi-1.6.pdf
- Code:
- clips/higherlevelsemantics
- Data:
- IS-A, SemEval-2018 Task-9