Developing A Hawaiian Corpus Toolkit for Data-Driven Language Learning

Joseph Winkie, Michol Miller, Winston Wu


Abstract
This paper presents the development of an online multimodal corpus toolkit designed for data-driven language learning in Hawaiian. The toolkit supports corpus linguistics analyses including concordance/KWIC (Key Word In Context) searches, frequency analysis, collocation analysis, and complex queries with n-grams and regex pattern matching. Designed specifically for educators, students, and parents within the Hawaiian community, this easy-to-use tool facilitates data-driven language learning by enabling users to explore authentic language data, identify patterns, and develop a deeper understanding of Hawaiian language structures through computational methods. By integrating corpus-based approaches into language education, the toolkit contributes to preserving and promoting Hawaiian language learning and supports the broader community’s efforts in language revitalization.
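To make the analyses named in the abstract concrete, here is a minimal Python sketch of a KWIC search driven by a regex and of n-gram frequency counting. It is an illustration only and not the toolkit's implementation: the function names (`tokenize`, `kwic`, `ngram_counts`), the tokenization rule, and the two sample lines (the Hawaiʻi state motto and a common revitalization slogan) are assumptions chosen for the example, not data or code from the paper.

```python
import re
import unicodedata
from collections import Counter

# Assumed tokenization rule: runs of word characters, with the ʻokina
# (U+02BB) explicitly allowed so that words such as "ʻāina" stay whole.
TOKEN = re.compile(r"[\w\u02bb]+")

def tokenize(text):
    """Lowercase, NFC-normalize (so kahakō vowels are single code points), and split."""
    return TOKEN.findall(unicodedata.normalize("NFC", text).lower())

def kwic(lines, pattern, width=2):
    """Return (left context, keyword, right context) for each token that
    fully matches the regex `pattern`, with `width` tokens on each side."""
    rx = re.compile(pattern)
    hits = []
    for line in lines:
        tokens = tokenize(line)
        for i, tok in enumerate(tokens):
            if rx.fullmatch(tok):
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                hits.append((left, tok, right))
    return hits

def ngram_counts(lines, n=2):
    """Count n-grams (as tuples of tokens) within each line."""
    counts = Counter()
    for line in lines:
        tokens = tokenize(line)
        counts.update(zip(*(tokens[k:] for k in range(n))))
    return counts

# Placeholder corpus for illustration only.
sample = [
    "Ua mau ke ea o ka ʻāina i ka pono",
    "E ola mau ka ʻōlelo Hawaiʻi",
]

for left, key, right in kwic(sample, r"ka"):
    print(f"{left:>12} [{key}] {right}")

print(ngram_counts(sample, n=1).most_common(2))
```

With `n=1` the same counter yields a plain word-frequency list; a collocation measure could be layered on top of the unigram and bigram counts, though which measure the toolkit uses is not stated here.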
Anthology ID:
2026.computel-1.18
Volume:
Proceedings of the Ninth Workshop on the Use of Computational Methods in the Study of Endangered Languages (ComputEL-9)
Month:
July
Year:
2026
Address:
San Diego, California, USA
Editors:
Godfred Agyapong, Sarah Moeller, Antti Arppe, Ali Marashian, Daisy Rosenblum
Venues:
ComputEL | WS
Publisher:
Association for Computational Linguistics
Pages:
167–176
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.computel-1.18/
Cite (ACL):
Joseph Winkie, Michol Miller, and Winston Wu. 2026. Developing A Hawaiian Corpus Toolkit for Data-Driven Language Learning. In Proceedings of the Ninth Workshop on the Use of Computational Methods in the Study of Endangered Languages (ComputEL-9), pages 167–176, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal):
Developing A Hawaiian Corpus Toolkit for Data-Driven Language Learning (Winkie et al., ComputEL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.computel-1.18.pdf
Supplementary material:
2026.computel-1.18.SupplementaryMaterial.txt