Abstract
This paper presents a finite-state morphological analyzer for the Gitksan language. The analyzer draws from a 1250-token Eastern dialect wordlist. It is based on finite-state technology and additionally includes two extensions which can provide analyses for out-of-vocabulary words: rules for generating predictable dialect variants, and a neural guesser component. The pre-neural analyzer, tested against interlinear-annotated texts from multiple dialects, achieves coverage of 75-81% and maintains high precision (95-100%). The neural extension improves coverage at the cost of lowered precision.
- Anthology ID:
- 2021.sigmorphon-1.21
- Volume:
- Proceedings of the 18th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology
- Month:
- August
- Year:
- 2021
- Address:
- Online
- Editors:
- Garrett Nicolai, Kyle Gorman, Ryan Cotterell
- Venue:
- SIGMORPHON
- SIG:
- SIGMORPHON
- Publisher:
- Association for Computational Linguistics
- Pages:
- 188–197
- URL:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2021.sigmorphon-1.21/
- DOI:
- 10.18653/v1/2021.sigmorphon-1.21
- Cite (ACL):
- Clarissa Forbes, Garrett Nicolai, and Miikka Silfverberg. 2021. An FST morphological analyzer for the Gitksan language. In Proceedings of the 18th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, pages 188–197, Online. Association for Computational Linguistics.
- Cite (Informal):
- An FST morphological analyzer for the Gitksan language (Forbes et al., SIGMORPHON 2021)
- PDF:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2021.sigmorphon-1.21.pdf