Scaling Cultural Resources for Improving Generative Models

Hayk Stepanyan, Aishwarya Verma, Andrew Zaldivar, Rutledge Chin Feman, Erin MacMurray van Liemt, Charu Kalia, Vinodkumar Prabhakaran, Sunipa Dev


Abstract
Generative models are known to underperform across diverse global cultural contexts and languages. While continual data updates are routinely conducted to improve overall model performance, bolstering and evaluating the cross-cultural competence of generative AI models requires data resources to be intentionally expanded to include global contexts and languages. In this work, we construct a multi-pronged pipeline to collect and contribute culturally salient, multilingual data. We posit that such data can assess the state of the global applicability of our models and thus, in turn, help identify and improve upon cross-cultural gaps.
Anthology ID:
2026.findings-eacl.352
Volume:
Findings of the Association for Computational Linguistics: EACL 2026
Month:
March
Year:
2026
Address:
Rabat, Morocco
Editors:
Vera Demberg, Kentaro Inui, Lluís Marquez
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
6695–6709
URL:
https://preview.aclanthology.org/ingest-eacl/2026.findings-eacl.352/
Cite (ACL):
Hayk Stepanyan, Aishwarya Verma, Andrew Zaldivar, Rutledge Chin Feman, Erin MacMurray van Liemt, Charu Kalia, Vinodkumar Prabhakaran, and Sunipa Dev. 2026. Scaling Cultural Resources for Improving Generative Models. In Findings of the Association for Computational Linguistics: EACL 2026, pages 6695–6709, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
Scaling Cultural Resources for Improving Generative Models (Stepanyan et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-eacl/2026.findings-eacl.352.pdf
Checklist:
2026.findings-eacl.352.checklist.pdf