Abstract
This paper describes LISN's submission to the second track (open track) of the shared task on Interlinear Glossing for SIGMORPHON 2023. Our systems are based on Lost, a variation of linear Conditional Random Fields initially developed as a probabilistic translation model and then adapted to the glossing task. This model allows us to handle one of the main challenges posed by glossing, i.e. the fact that the list of potential labels for lexical morphemes is not fixed in advance and needs to be extended dynamically when labelling units are not seen in training. In such situations, we show how to make use of candidate lexical glosses found in the translation and discuss how such an extension affects the training and inference procedures. The resulting automatic glossing systems yield very competitive results, especially in low-resource settings.
- Anthology ID:
- 2023.sigmorphon-1.21
- Volume:
- Proceedings of the 20th SIGMORPHON workshop on Computational Research in Phonetics, Phonology, and Morphology
- Month:
- July
- Year:
- 2023
- Address:
- Toronto, Canada
- Venue:
- SIGMORPHON
- SIG:
- SIGMORPHON
- Publisher:
- Association for Computational Linguistics
- Pages:
- 202–208
- URL:
- https://aclanthology.org/2023.sigmorphon-1.21
- Cite (ACL):
- Shu Okabe and François Yvon. 2023. LISN @ SIGMORPHON 2023 Shared Task on Interlinear Glossing. In Proceedings of the 20th SIGMORPHON workshop on Computational Research in Phonetics, Phonology, and Morphology, pages 202–208, Toronto, Canada. Association for Computational Linguistics.
- Cite (Informal):
- LISN @ SIGMORPHON 2023 Shared Task on Interlinear Glossing (Okabe & Yvon, SIGMORPHON 2023)
- PDF:
- https://preview.aclanthology.org/paclic-22-ingestion/2023.sigmorphon-1.21.pdf