Applying Linguistic Expertise to LLMs for Educational Material Development in Indigenous Languages

Justin Vasselli, Arturo Martínez Peguero, Junehwan Sung, Taro Watanabe


Abstract
This paper presents our approach to the AmericasNLP 2024 Shared Task 2 as the JAJ (/dʒæz/) team. The task aimed to create educational materials for indigenous languages, and we focused on Maya and Bribri. Given the unique linguistic features and challenges of these languages, and the limited size of the training datasets, we developed a hybrid methodology combining rule-based NLP methods with prompt-based techniques. This approach leverages the meta-linguistic capabilities of large language models, enabling us to blend broad, language-agnostic processing with customized solutions. Our approach lays a foundational framework that can be expanded to other indigenous languages in future work.
Anthology ID:
2024.americasnlp-1.24
Volume:
Proceedings of the 4th Workshop on Natural Language Processing for Indigenous Languages of the Americas (AmericasNLP 2024)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Manuel Mager, Abteen Ebrahimi, Shruti Rijhwani, Arturo Oncevay, Luis Chiruzzo, Robert Pugh, Katharina von der Wense
Venues:
AmericasNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
201–208
URL:
https://aclanthology.org/2024.americasnlp-1.24
Cite (ACL):
Justin Vasselli, Arturo Martínez Peguero, Junehwan Sung, and Taro Watanabe. 2024. Applying Linguistic Expertise to LLMs for Educational Material Development in Indigenous Languages. In Proceedings of the 4th Workshop on Natural Language Processing for Indigenous Languages of the Americas (AmericasNLP 2024), pages 201–208, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
Applying Linguistic Expertise to LLMs for Educational Material Development in Indigenous Languages (Vasselli et al., AmericasNLP-WS 2024)
PDF:
https://preview.aclanthology.org/jeptaln-2024-ingestion/2024.americasnlp-1.24.pdf
Supplementary material:
 2024.americasnlp-1.24.SupplementaryMaterial.zip