Learning Low-Resource Languages Through NLP-Driven Flashcards: A Case Study of Hokkien in Language Learning Applications

Tai Zhang, Lucie Yang, Erin Chen, Karen Riani, Jessica Zipf, Mariana Shimabukuro, En-Shiun Annie Lee


Abstract
LangLearn is an open-source framework designed to facilitate autonomous learning of low-resource languages (LRL). By combining a language-agnostic approach with AI-enhanced flashcards, LangLearn empowers users to generate custom flashcards for their vocabulary, while offering structured learning through both pre-curated and self-curated decks. The framework integrates six key components: the word definition, corresponding Hanji characters, romanization with numeric tones, audio pronunciation, a sample sentence, as well as a contextual AI-generated image. LangLearn currently supports English and Taiwanese Hokkien (a variety of Southern Min), with plans to extend support for other dialects. Our preliminary study demonstrates that LangLearn positively empowers users to engage with LRLs using their vocabulary preferences, with a comprehensive user study currently underway. LangLearn’s modular structure enables future expansion, including ASR-based pronunciation practice. The code is available at https://github.com/HokkienTranslation/HokkienTranslation.
Anthology ID:
2025.naacl-demo.26
Volume:
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (System Demonstrations)
Month:
April
Year:
2025
Address:
Albuquerque, New Mexico
Editors:
Nouha Dziri, Sean (Xiang) Ren, Shizhe Diao
Venues:
NAACL | WS
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
303–312
Language:
URL:
https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.naacl-demo.26/
DOI:
Bibkey:
Cite (ACL):
Tai Zhang, Lucie Yang, Erin Chen, Karen Riani, Jessica Zipf, Mariana Shimabukuro, and En-Shiun Annie Lee. 2025. Learning Low-Resource Languages Through NLP-Driven Flashcards: A Case Study of Hokkien in Language Learning Applications. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (System Demonstrations), pages 303–312, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal):
Learning Low-Resource Languages Through NLP-Driven Flashcards: A Case Study of Hokkien in Language Learning Applications (Zhang et al., NAACL 2025)
Copy Citation:
PDF:
https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.naacl-demo.26.pdf