Abstract
Finite-state approaches to morphological analysis have been shown to improve the performance of natural language processing systems for polysynthetic languages, in which words are generally composed of many morphemes, on tasks such as language modelling (Schwartz et al., 2020). However, finite-state morphological analyzers are expensive to construct and require expert knowledge of a language’s structure. Currently, there is no broad-coverage finite-state model of morphology for Wolastoqey, also known as Passamaquoddy-Maliseet, an endangered low-resource Algonquian language. In this paper, we therefore investigate using two unsupervised models, MorphAGram and Morfessor, to obtain morphological segmentations for Wolastoqey. We train MorphAGram and Morfessor models on a small corpus of Wolastoqey words and evaluate them on two annotated datasets. Our results indicate that MorphAGram outperforms Morfessor for morphological segmentation of Wolastoqey.
- Anthology ID:
- 2022.sigul-1.20
- Volume:
- Proceedings of the 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages
- Month:
- June
- Year:
- 2022
- Address:
- Marseille, France
- Venue:
- SIGUL
- SIG:
- SIGUL
- Publisher:
- European Language Resources Association
- Pages:
- 155–160
- URL:
- https://aclanthology.org/2022.sigul-1.20
- Cite (ACL):
- Diego Bear and Paul Cook. 2022. Evaluating Unsupervised Approaches to Morphological Segmentation for Wolastoqey. In Proceedings of the 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages, pages 155–160, Marseille, France. European Language Resources Association.
- Cite (Informal):
- Evaluating Unsupervised Approaches to Morphological Segmentation for Wolastoqey (Bear & Cook, SIGUL 2022)
- PDF:
- https://preview.aclanthology.org/starsem-semeval-split/2022.sigul-1.20.pdf