Computational Etymology and Word Emergence

Winston Wu, David Yarowsky


Abstract
We developed an extensible, comprehensive Wiktionary parser that improves over several existing parsers. We predict the etymology of a word across the full range of etymology types and languages in Wiktionary, showing improvements over a strong baseline. We also model word emergence and show the application of etymology in modeling this phenomenon. We release our parser to further research in this understudied field.
Anthology ID:
2020.lrec-1.397
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
3252–3259
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.397
Cite (ACL):
Winston Wu and David Yarowsky. 2020. Computational Etymology and Word Emergence. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 3252–3259, Marseille, France. European Language Resources Association.
Cite (Informal):
Computational Etymology and Word Emergence (Wu & Yarowsky, LREC 2020)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2020.lrec-1.397.pdf