Development of Natural Language Processing Tools for Cook Islands Māori

Rolando Coto Solano, Sally Akevai Nicholas, Samantha Wray


Abstract
This paper presents three ongoing projects for NLP in Cook Islands Māori: untrained forced alignment (approx. 9% error when detecting the center of words), speech-to-text (37% word error rate (WER) in the best trained models), and POS tagging (92% accuracy for the best-performing model). These projects contribute new resources that fill a gap for Australasian languages: gold-standard POS-tagged written corpora, transcribed speech corpora, and corpora time-aligned down to the level of phonemes. They are part of efforts to accelerate the documentation of Cook Islands Māori and to increase its vitality amongst its users.
Anthology ID:
U18-1003
Volume:
Proceedings of the Australasian Language Technology Association Workshop 2018
Month:
December
Year:
2018
Address:
Dunedin, New Zealand
Editors:
Sunghwan Mac Kim, Xiuzhen (Jenny) Zhang
Venue:
ALTA
Pages:
26–33
URL:
https://aclanthology.org/U18-1003
Cite (ACL):
Rolando Coto Solano, Sally Akevai Nicholas, and Samantha Wray. 2018. Development of Natural Language Processing Tools for Cook Islands Māori. In Proceedings of the Australasian Language Technology Association Workshop 2018, pages 26–33, Dunedin, New Zealand.
Cite (Informal):
Development of Natural Language Processing Tools for Cook Islands Māori (Solano et al., ALTA 2018)
PDF:
https://preview.aclanthology.org/nschneid-patch-4/U18-1003.pdf