@inproceedings{verma-etal-2020-sthal,
    title = "{STHAL}: Location-mention Identification in Tweets of {I}ndian-context",
    author = "Verma, Kartik  and
      Sinha, Shobhit  and
      Akhtar, Md. Shad  and
      Goyal, Vikram",
    editor = "Bhattacharyya, Pushpak  and
      Sharma, Dipti Misra  and
      Sangal, Rajeev",
    booktitle = "Proceedings of the 17th International Conference on Natural Language Processing (ICON)",
    month = dec,
    year = "2020",
    address = "Indian Institute of Technology Patna, Patna, India",
    publisher = "NLP Association of India (NLPAI)",
    url = "https://preview.aclanthology.org/ingest-emnlp/2020.icon-main.52/",
    pages = "379--383",
    abstract = "We investigate the problem of extracting Indian-locations from a given crowd-sourced textual dataset. The problem of extracting fine-grained Indian-locations has many challenges. One challenge in the task is to collect relevant dataset from the crowd-sourced platforms that contain locations. The second challenge lies in extracting the location entities from the collected data. We provide an in-depth review of the information collection process and our annotation guidelines such that a reliable dataset annotation is guaranteed. We evaluate many recent algorithms and models, including Conditional Random Fields (CRF), Bi-LSTM-CNN and BERT (Bidirectional Encoder Representations from Transformers), on our developed dataset named STHAL. The study shows the best F1-score of 72.49{\%} for BERT, followed by Bi-LSTM-CNN and CRF. As a result of our work, we prepare a publicly-available annotated dataset of Indian geolocations that can be used by the research community. Code and dataset are available at \url{https://github.com/vkartik2k/STHAL}."
}