Jata MacCabe


2023

Glossy Bytes: Neural Glossing using Subword Encoding
Ziggy Cross | Michelle Yun | Ananya Apparaju | Jata MacCabe | Garrett Nicolai | Miikka Silfverberg
Proceedings of the 20th SIGMORPHON workshop on Computational Research in Phonetics, Phonology, and Morphology

This paper presents several approaches to interlinear glossing based on neural subword modelling for seven under-resourced languages, as part of the 2023 SIGMORPHON shared task on interlinear glossing. We experiment with various augmentation and tokenization strategies for both the open and closed tracks of data. We find that while byte-level models may perform well when more data is available, character-based approaches remain competitive in lower-resource settings.