Continuous Representation of Location for Geolocation and Lexical Dialectology using Mixture Density Networks

Afshin Rahimi, Timothy Baldwin, Trevor Cohn


Abstract
We propose a method for embedding two-dimensional locations in a continuous vector space using a neural network-based model incorporating mixtures of Gaussian distributions, presenting two model variants for text-based geolocation and lexical dialectology. Evaluated over Twitter data, the proposed model outperforms conventional regression-based geolocation and provides a better estimate of uncertainty. We also show the effectiveness of the representation for predicting words from location in lexical dialectology, and evaluate it using the DARE dataset.
Anthology ID:
D17-1016
Volume:
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
167–176
URL:
https://aclanthology.org/D17-1016
DOI:
10.18653/v1/D17-1016
Cite (ACL):
Afshin Rahimi, Timothy Baldwin, and Trevor Cohn. 2017. Continuous Representation of Location for Geolocation and Lexical Dialectology using Mixture Density Networks. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 167–176, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
Continuous Representation of Location for Geolocation and Lexical Dialectology using Mixture Density Networks (Rahimi et al., EMNLP 2017)
PDF:
https://preview.aclanthology.org/remove-xml-comments/D17-1016.pdf
Video:
https://vimeo.com/238228698
Code:
afshinrahimi/geomdn