Enhancing Geocoding of Adjectival Toponyms With Heuristics

Breno Dourado Sá, Ticiana Coelho da Silva, Jose Antonio Fernandes de Macedo


Abstract
Unstructured text documents such as news articles and blogs often contain references to places. These references, called toponyms, can be used in applications such as disaster warning and tourism planning. However, obtaining the correct coordinates for a toponym, a task called geocoding, is difficult because many places share their name with other locations. The task becomes even more challenging when toponyms appear in adjectival form, since that form differs from the place’s actual name. This paper addresses the geocoding task and aims to improve, through a heuristic approach, the handling of adjectival toponyms. First, a baseline geocoder is defined by experimenting with a set of heuristics. The baseline is then enhanced by adding a normalization step at the beginning of the geocoding process that maps adjectival toponyms to their noun form. The results show improved performance for the enhanced geocoder compared to the baseline and other geocoders.
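To make the idea of a normalization step concrete, the following is a minimal sketch and not the authors' implementation. It assumes a small hand-written lookup table from adjectival to noun forms and a toy gazetteer. The names ADJECTIVE_TO_NOUN, GAZETTEER, normalize, and geocode are hypothetical, the coordinates and populations are rough placeholders, and the "prefer the most populous candidate" rule is a common disambiguation heuristic used here only for illustration; the abstract does not say which heuristics the paper adopts.

```python
# Hypothetical lookup table: adjectival form -> noun form (illustrative entries only).
ADJECTIVE_TO_NOUN = {
    "french": "France",
    "brazilian": "Brazil",
    "parisian": "Paris",
}

# Toy gazetteer: place name -> candidate (latitude, longitude, population) tuples.
# All values are rough placeholders, not reference data.
GAZETTEER = {
    "Paris": [(48.86, 2.35, 2_100_000), (33.66, -95.56, 25_000)],
    "France": [(46.0, 2.0, 67_000_000)],
    "Brazil": [(-10.0, -55.0, 210_000_000)],
}


def normalize(toponym):
    """Map an adjectival toponym to its noun form; leave other strings unchanged."""
    return ADJECTIVE_TO_NOUN.get(toponym.lower(), toponym)


def geocode(toponym):
    """Normalize first, then pick the most populous gazetteer candidate (assumed heuristic)."""
    candidates = GAZETTEER.get(normalize(toponym), [])
    if not candidates:
        return None  # unknown place: no coordinates available
    lat, lon, _population = max(candidates, key=lambda c: c[2])
    return (lat, lon)
```

In this sketch, geocode("Parisian") is first normalized to "Paris" and then resolved against the gazetteer, whereas a geocoder without the normalization step would find no entry for the adjectival form.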
Anthology ID:
2022.politicalnlp-1.6
Volume:
Proceedings of the LREC 2022 workshop on Natural Language Processing for Political Sciences
Month:
June
Year:
2022
Address:
Marseille, France
Venue:
PoliticalNLP
Publisher:
European Language Resources Association
Pages:
37–45
URL:
https://aclanthology.org/2022.politicalnlp-1.6
Cite (ACL):
Breno Dourado Sá, Ticiana Coelho da Silva, and Jose Antonio Fernandes de Macedo. 2022. Enhancing Geocoding of Adjectival Toponyms With Heuristics. In Proceedings of the LREC 2022 workshop on Natural Language Processing for Political Sciences, pages 37–45, Marseille, France. European Language Resources Association.
Cite (Informal):
Enhancing Geocoding of Adjectival Toponyms With Heuristics (Dourado Sá et al., PoliticalNLP 2022)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2022.politicalnlp-1.6.pdf