TIMBERT: Toponym Identifier For The Medical Domain Based on BERT

MohammadReza Davari, Leila Kosseim, Tien Bui


Abstract
In this paper, we propose an approach to automate place name detection in the medical domain, enabling epidemiologists to better study and model the spread of viruses. We created a family of Toponym Identification Models based on BERT (TIMBERT) that learn, in an end-to-end fashion, the mapping from an input sentence to the same sentence labeled with toponyms. When evaluated on the SemEval 2019 Task 12 test set (Weissenbacher et al., 2019), our best TIMBERT model achieves an F1 score of 90.85%, a significant improvement over the previous state of the art of 89.13% (Wang et al., 2019).
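As a rough illustration of what "a sentence labeled with toponyms" and a span-level F1 score can look like, here is a minimal sketch. The BIO tag names (`B-TOP`, `I-TOP`, `O`), the example sentence, the helper functions, and the exact-match scoring rule are all assumptions made for illustration; the abstract does not specify the label scheme or the precise scoring used in SemEval 2019 Task 12.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start token index, end token index, exclusive)


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Collect toponym spans from hypothetical B-TOP / I-TOP / O tags."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B-TOP":
            # A new toponym begins; close any span that is still open.
            if start is not None:
                spans.append((start, i))
            start = i
        elif tag == "I-TOP" and start is not None:
            # Continuation of the currently open toponym.
            continue
        else:
            # "O" (or a stray I-TOP with no open span) ends any open span.
            if start is not None:
                spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans


def strict_f1(gold: List[Span], pred: List[Span]) -> float:
    """F1 where a predicted span counts only if it matches a gold span exactly."""
    gold_set, pred_set = set(gold), set(pred)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented example sentence, one tag per token.
tokens = ["The", "virus", "was", "isolated", "in", "Hong", "Kong",
          "and", "later", "in", "Vietnam", "."]
gold_tags = ["O", "O", "O", "O", "O", "B-TOP", "I-TOP",
             "O", "O", "O", "B-TOP", "O"]
# A hypothetical model output that misses the second toponym.
pred_tags = ["O", "O", "O", "O", "O", "B-TOP", "I-TOP",
             "O", "O", "O", "O", "O"]

gold_spans = bio_to_spans(gold_tags)
pred_spans = bio_to_spans(pred_tags)
score = strict_f1(gold_spans, pred_spans)
```

In this toy case the prediction recovers one of the two gold toponyms, so precision is 1.0, recall is 0.5, and the F1 score is 2/3.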
Anthology ID:
2020.coling-main.58
Volume:
Proceedings of the 28th International Conference on Computational Linguistics
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
662–668
URL:
https://aclanthology.org/2020.coling-main.58
DOI:
10.18653/v1/2020.coling-main.58
Cite (ACL):
MohammadReza Davari, Leila Kosseim, and Tien Bui. 2020. TIMBERT: Toponym Identifier For The Medical Domain Based on BERT. In Proceedings of the 28th International Conference on Computational Linguistics, pages 662–668, Barcelona, Spain (Online). International Committee on Computational Linguistics.
Cite (Informal):
TIMBERT: Toponym Identifier For The Medical Domain Based on BERT (Davari et al., COLING 2020)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2020.coling-main.58.pdf
Data:
Penn Treebank