A finite-state morphological analyser for Highland Puebla Nahuatl

Robert Pugh, Francis Tyers


Abstract
This paper describes the development of a free/open-source finite-state morphological transducer for Highland Puebla Nahuatl, a Uto-Aztecan language spoken in and around the state of Puebla in Mexico. The finite-state toolkit used for the work is the Helsinki Finite-State Toolkit (HFST); we use the lexc formalism for modelling the morphotactics and the twol formalism for modelling morphophonological alternations. An evaluation is presented which shows that the transducer has reasonable coverage (around 90%) on freely-available corpora of the language, and high precision (over 95%) on a manually verified test set.
Anthology ID:
2023.americasnlp-1.12
Volume:
Proceedings of the Workshop on Natural Language Processing for Indigenous Languages of the Americas (AmericasNLP)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Manuel Mager, Abteen Ebrahimi, Arturo Oncevay, Enora Rice, Shruti Rijhwani, Alexis Palmer, Katharina Kann
Venue:
AmericasNLP
Publisher:
Association for Computational Linguistics
Pages:
103–108
URL:
https://aclanthology.org/2023.americasnlp-1.12
DOI:
10.18653/v1/2023.americasnlp-1.12
Cite (ACL):
Robert Pugh and Francis Tyers. 2023. A finite-state morphological analyser for Highland Puebla Nahuatl. In Proceedings of the Workshop on Natural Language Processing for Indigenous Languages of the Americas (AmericasNLP), pages 103–108, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
A finite-state morphological analyser for Highland Puebla Nahuatl (Pugh & Tyers, AmericasNLP 2023)
PDF:
https://preview.aclanthology.org/emnlp22-frontmatter/2023.americasnlp-1.12.pdf