Tai Zhang


2025

pdf bib
Learning Low-Resource Languages Through NLP-Driven Flashcards: A Case Study of Hokkien in Language Learning Applications
Tai Zhang | Lucie Yang | Erin Chen | Karen Riani | Jessica Zipf | Mariana Shimabukuro | En-Shiun Annie Lee
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (System Demonstrations)

LangLearn is an open-source framework designed to facilitate autonomous learning of low-resource languages (LRL). By combining a language-agnostic approach with AI-enhanced flashcards, LangLearn empowers users to generate custom flashcards for their vocabulary, while offering structured learning through both pre-curated and self-curated decks. The framework integrates six key components: the word definition, corresponding Hanji characters, romanization with numeric tones, audio pronunciation, a sample sentence, as well as a contextual AI-generated image. LangLearn currently supports English and Taiwanese Hokkien (a variety of Southern Min), with plans to extend support for other dialects. Our preliminary study demonstrates that LangLearn positively empowers users to engage with LRLs using their vocabulary preferences, with a comprehensive user study currently underway. LangLearn’s modular structure enables future expansion, including ASR-based pronunciation practice. The code is available at https://github.com/HokkienTranslation/HokkienTranslation.