STHAL: Location-mention Identification in Tweets of Indian-context

Kartik Verma; Shobhit Sinha; Md. Shad Akhtar; Vikram Goyal

Abstract

We investigate the problem of extracting Indian locations from crowd-sourced text. Extracting fine-grained Indian locations poses several challenges. The first is collecting a relevant dataset from crowd-sourced platforms that contains location mentions. The second is extracting the location entities from the collected data. We provide an in-depth review of the data collection process and of our annotation guidelines, so that the dataset annotation is reliable. We evaluate several recent algorithms and models, including Conditional Random Fields (CRF), Bi-LSTM-CNN, and BERT (Bidirectional Encoder Representations from Transformers), on the dataset we developed. BERT achieves the best F1-score, 72.49%, followed by Bi-LSTM-CNN and CRF. As a result of this work, we release a publicly available annotated dataset of Indian geolocations for use by the research community. Code and dataset are available at https://github.com/vkartik2k/STHAL.

Anthology ID: 2020.icon-main.52
Volume: Proceedings of the 17th International Conference on Natural Language Processing (ICON)
Month: December
Year: 2020
Address: Indian Institute of Technology Patna, Patna, India
Editors: Pushpak Bhattacharyya, Dipti Misra Sharma, Rajeev Sangal
Venue: ICON
Publisher: NLP Association of India (NLPAI)
Pages: 379–383
URL: https://preview.aclanthology.org/ingest-emnlp/2020.icon-main.52/
Cite (ACL): Kartik Verma, Shobhit Sinha, Md. Shad Akhtar, and Vikram Goyal. 2020. STHAL: Location-mention Identification in Tweets of Indian-context. In Proceedings of the 17th International Conference on Natural Language Processing (ICON), pages 379–383, Indian Institute of Technology Patna, Patna, India. NLP Association of India (NLPAI).
Cite (Informal): STHAL: Location-mention Identification in Tweets of Indian-context (Verma et al., ICON 2020)
PDF: https://preview.aclanthology.org/ingest-emnlp/2020.icon-main.52.pdf