Abstract
This paper compares how different machine learning classifiers can be used together with simple string matching and named entity recognition to detect locations in texts. We compare five different state-of-the-art machine learning classifiers in order to predict whether a sentence contains a location or not. Following this classification task, we use a string matching algorithm with a gazetteer to identify the exact index of a toponym within the sentence. We evaluate different approaches in terms of machine learning classifiers, text pre-processing and location extraction on the SemEval-2019 Task 12 dataset, compiled for toponym resolution in the bio-medical domain. Finally, we compare the results with our system that was previously submitted to the SemEval-2019 task evaluation.
- Anthology ID:
- R19-1106
- Volume:
- Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)
- Month:
- September
- Year:
- 2019
- Address:
- Varna, Bulgaria
- Editors:
- Ruslan Mitkov, Galia Angelova
- Venue:
- RANLP
- Publisher:
- INCOMA Ltd.
- Pages:
- 912–921
- URL:
- https://aclanthology.org/R19-1106
- DOI:
- 10.26615/978-954-452-056-4_106
- Cite (ACL):
- Alistair Plum, Tharindu Ranasinghe, and Constantin Orasan. 2019. Toponym Detection in the Bio-Medical Domain: A Hybrid Approach with Deep Learning. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019), pages 912–921, Varna, Bulgaria. INCOMA Ltd.
- Cite (Informal):
- Toponym Detection in the Bio-Medical Domain: A Hybrid Approach with Deep Learning (Plum et al., RANLP 2019)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-5/R19-1106.pdf
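
The two-stage idea described in the abstract (first decide whether a sentence contains a location, then use gazetteer string matching to find the toponym's position) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `find_toponyms` and `detect`, the `has_location` callable standing in for a trained sentence classifier, and the regex-based matching strategy are all assumptions made for illustration.

```python
import re
from typing import Callable, Iterable, List, Tuple

Span = Tuple[int, int, str]  # (start offset, end offset, matched text)


def find_toponyms(sentence: str, gazetteer: Iterable[str]) -> List[Span]:
    """Return character spans of gazetteer entries found in the sentence.

    Hypothetical helper: plain whole-word matching, case-sensitive.
    Entries that begin or end with punctuation would need different
    boundary handling than the \\b anchors used here.
    """
    # Try longer names first so "New York City" is preferred over "New York".
    names = sorted({n for n in gazetteer if n}, key=len, reverse=True)
    if not names:
        return []
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(n) for n in names) + r")\b"
    )
    return [(m.start(), m.end(), m.group(0)) for m in pattern.finditer(sentence)]


def detect(
    sentences: Iterable[str],
    has_location: Callable[[str], bool],
    gazetteer: Iterable[str],
) -> List[Tuple[str, List[Span]]]:
    """Run gazetteer matching only on sentences the classifier flags.

    `has_location` is a stand-in for any sentence-level classifier;
    here it is just a callable returning True or False.
    """
    names = list(gazetteer)
    results = []
    for sentence in sentences:
        if has_location(sentence):
            results.append((sentence, find_toponyms(sentence, names)))
    return results
```

Gating the string matching on the classifier's decision means that gazetteer entries which are also common words are only matched in sentences already judged likely to mention a place, which is the motivation for combining the two steps in this sketch.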