Abstract
We present a fully automated workflow for phylogenetic reconstruction on large datasets, consisting of two novel methods, one for fast detection of cognates and one for fast Bayesian phylogenetic inference. Our results show that the methods take less than a few minutes to process language families that have so far required large amounts of time and computational power. Moreover, the cognates and the trees inferred from the method are quite close, both to gold standard cognate judgments and to expert language family trees. Given its speed and ease of application, our framework is specifically useful for the exploration of very large datasets in historical linguistics.

- Anthology ID:
- P19-1627
- Volume:
- Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
- Month:
- July
- Year:
- 2019
- Address:
- Florence, Italy
- Editors:
- Anna Korhonen, David Traum, Lluís Màrquez
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 6225–6235
- URL:
- https://aclanthology.org/P19-1627
- DOI:
- 10.18653/v1/P19-1627
- Cite (ACL):
- Taraka Rama and Johann-Mattis List. 2019. An Automated Framework for Fast Cognate Detection and Bayesian Phylogenetic Inference in Computational Historical Linguistics. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 6225–6235, Florence, Italy. Association for Computational Linguistics.
- Cite (Informal):
- An Automated Framework for Fast Cognate Detection and Bayesian Phylogenetic Inference in Computational Historical Linguistics (Rama & List, ACL 2019)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/P19-1627.pdf
- Code:
- lingpy/bipskip