SRS-Stories: Vocabulary-constrained multilingual story generation for language learning

Wiktor Kamzela, Mateusz Lango, Ondrej Dusek


Abstract
In this paper, we use large language models to generate personalized stories for language learners, using only the vocabulary they know. The generated texts are specifically written to teach the user new vocabulary by simply reading stories where it appears in context, while at the same time seamlessly reviewing recently learned vocabulary. The generated stories are enjoyable to read and the vocabulary reviewing/learning is optimized by a Spaced Repetition System. The experiments are conducted in three languages: English, Chinese and Polish, evaluating three story generation methods and three strategies for enforcing lexical constraints. The results show that the generated stories are more grammatical and coherent and provide better examples of word usage than texts generated by the standard constrained beam search approach.
Anthology ID:
2025.emnlp-industry.44
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track
Month:
November
Year:
2025
Address:
Suzhou (China)
Editors:
Saloni Potdar, Lina Rojas-Barahona, Sebastien Montella
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
630–645
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-industry.44/
Cite (ACL):
Wiktor Kamzela, Mateusz Lango, and Ondrej Dusek. 2025. SRS-Stories: Vocabulary-constrained multilingual story generation for language learning. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track, pages 630–645, Suzhou (China). Association for Computational Linguistics.
Cite (Informal):
SRS-Stories: Vocabulary-constrained multilingual story generation for language learning (Kamzela et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-industry.44.pdf