Where are you from? Geolocating Speech and Applications to Language Identification
Patrick Foley, Matthew Wiesner, Bismarck Odoom, Leibny Paola Garcia Perera, Kenton Murray, Philipp Koehn
Abstract
We train models to answer the question "Where are you from?" and show how such models can be repurposed for language identification (LID). To our knowledge, this paper is the first to introduce data sources, methods, and models to tackle the task of geolocation of speech at a global scale, and the first to explore using geolocation as a proxy task for LID. Specifically, we explore whether radio broadcasts with known origin can be used to train regression and classification-based models for geolocating speech. We build models on top of self-supervised pretrained models, using attention pooling to qualitatively verify that the model geolocates the speech itself, and not other channel artifacts. The best geolocation models localize speaker origin to around 650 km. We confirm the value of speech geolocation as a proxy task by using speech geolocation models for zero-shot LID. Finally, we show that fine-tuning geolocation models for LID outperforms fine-tuning pretrained Wav2Vec2.0 models, and achieves state-of-the-art performance on the FLEURS benchmark.
- Anthology ID:
- 2024.naacl-long.286
- Volume:
- Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
- Month:
- June
- Year:
- 2024
- Address:
- Mexico City, Mexico
- Editors:
- Kevin Duh, Helena Gomez, Steven Bethard
- Venue:
- NAACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 5114–5126
- URL:
- https://aclanthology.org/2024.naacl-long.286
- DOI:
- 10.18653/v1/2024.naacl-long.286
- Cite (ACL):
- Patrick Foley, Matthew Wiesner, Bismarck Odoom, Leibny Paola Garcia Perera, Kenton Murray, and Philipp Koehn. 2024. Where are you from? Geolocating Speech and Applications to Language Identification. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 5114–5126, Mexico City, Mexico. Association for Computational Linguistics.
- Cite (Informal):
- Where are you from? Geolocating Speech and Applications to Language Identification (Foley et al., NAACL 2024)
- PDF:
- https://preview.aclanthology.org/dois-2013-emnlp/2024.naacl-long.286.pdf
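The abstract reports that the best models localize speaker origin to around 650 km, but it does not say how that error is measured. As a minimal sketch, assuming the error is a great-circle distance between a predicted and a true latitude/longitude pair, it could be computed with the haversine formula. The function name `great_circle_km`, the constant `EARTH_RADIUS_KM`, and the spherical-Earth radius of 6371 km are illustrative assumptions and are not taken from the paper.

```python
import math

# Assumption: a spherical Earth with the commonly used mean radius.
EARTH_RADIUS_KM = 6371.0


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)

    # Haversine term; clamp the square root to guard against rounding above 1.
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


if __name__ == "__main__":
    # Hypothetical prediction and reference coordinates, for illustration only.
    predicted = (19.4, -99.1)
    reference = (20.7, -103.3)
    error_km = great_circle_km(*predicted, *reference)
    print(f"geolocation error: {error_km:.1f} km")
```

Under this assumption, a dataset-level figure such as the one quoted in the abstract would be an aggregate (for example a mean or median) of these per-utterance distances; which aggregate the authors use is not stated here.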