Abstract
Identification of lexical borrowings, the transfer of words between languages, is an essential practice of historical linguistics and a vital tool in the analysis of language contact and cultural events in general. We seek to improve tools for automatic detection of lexical borrowings, focusing here on detecting borrowed words from monolingual wordlists. Starting with a recurrent neural lexical language model and a competing entropies approach, we incorporate a more current Transformer-based lexical model. From there we experiment with several different models and approaches, including a lexical donor model with an augmented wordlist. The Transformer model reduces execution time and minimally improves borrowing detection. The augmented donor model shows some promise. A substantive change in approach or model is needed to make significant gains in identification of lexical borrowings.
- Anthology ID:
- 2021.ranlp-srw.16
- Volume:
- Proceedings of the Student Research Workshop Associated with RANLP 2021
- Month:
- September
- Year:
- 2021
- Address:
- Online
- Editors:
- Souhila Djabri, Dinara Gimadi, Tsvetomila Mihaylova, Ivelina Nikolova-Koleva
- Venue:
- RANLP
- Publisher:
- INCOMA Ltd.
- Pages:
- 109–117
- URL:
- https://aclanthology.org/2021.ranlp-srw.16
- Cite (ACL):
- John Miller, Emanuel Pariasca, and Cesar Beltran Castañon. 2021. Neural Borrowing Detection with Monolingual Lexical Models. In Proceedings of the Student Research Workshop Associated with RANLP 2021, pages 109–117, Online. INCOMA Ltd.
- Cite (Informal):
- Neural Borrowing Detection with Monolingual Lexical Models (Miller et al., RANLP 2021)
- PDF:
- https://preview.aclanthology.org/fix-volume-bibkeys/2021.ranlp-srw.16.pdf