Culturally-Nuanced Story Generation for Reasoning in Low-Resource Languages: The Case of Javanese and Sundanese

Salsabila Zahirah Pranida; Rifo Ahmad Genadi; Fajri Koto

Abstract

Culturally grounded commonsense reasoning is underexplored in low-resource languages due to scarce data and costly native annotation. We test whether large language models (LLMs) can generate culturally nuanced narratives for such settings. Focusing on Javanese and Sundanese, we compare three data creation strategies: (1) LLM-assisted stories prompted with cultural cues, (2) machine translation from Indonesian benchmarks, and (3) native-written stories. Human evaluation finds that LLM stories match native-written stories on cultural fidelity but lag in coherence and correctness. We fine-tune models on each dataset and evaluate on a human-authored test set for classification and generation. LLM-generated data yields higher downstream performance than machine-translated and Indonesian human-authored training data. We release a high-quality benchmark of culturally grounded commonsense stories in Javanese and Sundanese to support future work.

Anthology ID: 2025.mrl-main.25
Volume: Proceedings of the 5th Workshop on Multilingual Representation Learning (MRL 2025)
Month: November
Year: 2025
Address: Suzhou, China
Editors: David Ifeoluwa Adelani, Catherine Arnett, Duygu Ataman, Tyler A. Chang, Hila Gonen, Rahul Raja, Fabian Schmidt, David Stap, Jiayi Wang
Venues: MRL | WS
Publisher: Association for Computational Linguistics
Pages: 369–384
URL: https://preview.aclanthology.org/ingest-emnlp/2025.mrl-main.25/
Cite (ACL): Salsabila Zahirah Pranida, Rifo Ahmad Genadi, and Fajri Koto. 2025. Culturally-Nuanced Story Generation for Reasoning in Low-Resource Languages: The Case of Javanese and Sundanese. In Proceedings of the 5th Workshop on Multilingual Representation Learning (MRL 2025), pages 369–384, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): Culturally-Nuanced Story Generation for Reasoning in Low-Resource Languages: The Case of Javanese and Sundanese (Pranida et al., MRL 2025)
PDF: https://preview.aclanthology.org/ingest-emnlp/2025.mrl-main.25.pdf
