Ayano Takeuchi


2024

Automatic extraction of geographic information, including Location Referring Expressions (LREs), can aid humanities research in analyzing large collections of historical texts. In this study, to investigate how accurate pretrained Transformer language models (LMs) can extract LREs from historical texts, we evaluate two representative types of LMs, namely, masked language model and causal language model, using early modern and contemporary Japanese datasets. Our experimental results demonstrated the potential of contemporary LMs for historical texts, but also suggest the need for further model enhancement, such as pretraining on historical texts.