Neural Borrowing Detection with Monolingual Lexical Models

John Miller, Emanuel Pariasca, Cesar Beltran Castañon


Abstract
Identification of lexical borrowings, the transfer of words between languages, is an essential practice in historical linguistics and a vital tool in the analysis of language contact and cultural events in general. We seek to improve tools for automatic detection of lexical borrowings, focusing here on detecting borrowed words from monolingual wordlists. Starting with a recurrent neural lexical language model and a competing entropies approach, we incorporate a more current Transformer-based lexical model. From there we experiment with several different models and approaches, including a lexical donor model with an augmented wordlist. The Transformer model reduces execution time and minimally improves borrowing detection. The augmented donor model shows some promise. A substantive change in approach or model is needed to make significant gains in the identification of lexical borrowings.
Anthology ID:
2021.ranlp-srw.16
Volume:
Proceedings of the Student Research Workshop Associated with RANLP 2021
Month:
September
Year:
2021
Address:
Online
Venue:
RANLP
Publisher:
INCOMA Ltd.
Pages:
109–117
URL:
https://aclanthology.org/2021.ranlp-srw.16
Cite (ACL):
John Miller, Emanuel Pariasca, and Cesar Beltran Castañon. 2021. Neural Borrowing Detection with Monolingual Lexical Models. In Proceedings of the Student Research Workshop Associated with RANLP 2021, pages 109–117, Online. INCOMA Ltd.
Cite (Informal):
Neural Borrowing Detection with Monolingual Lexical Models (Miller et al., RANLP 2021)
PDF:
https://preview.aclanthology.org/auto-file-uploads/2021.ranlp-srw.16.pdf