Sofía Martinelli
Also published as: Sof{\'\i}a Martinelli
2026
Adaptive Data Collection for Latin-American Community-sourced Evaluation of Stereotypes (LACES)
Guido Ivetta | Pietro Palombini | Sofía Martinelli | Marcos J Gomez | M Emilia Echeveste | Sunipa Dev | Vinodkumar Prabhakaran | Luciana Benotti
Findings of the Association for Computational Linguistics: ACL 2026
The evaluation of societal biases in NLP models is critically hindered by a geo-cultural gap. This leaves regions such as Latin America severely underserved, making it impossible to adequately assess or mitigate the perpetuation of harmful regional stereotypes in language technologies. This paper presents LACES, a stereotype association dataset for 15 Latin American countries. This dataset includes 4,789 stereotype associations (the de-identified dataset can be accessed via GitHub), manually created and annotated by 83 participants. The dataset was developed through targeted community partnerships across Latin America. Additionally, in this paper, we propose a novel adaptive data collection methodology that uniquely integrates the sourcing of new stereotype entries and the validation of existing data within a single, unified workflow. This approach results in a resource with more unique stereotypes than previous static collection methods, enabling a more efficient stereotype collection. The paper further supports the quality of LACES by demonstrating reduced efficacy of debiasing methods on this dataset in comparison to existing popular stereotype benchmarks. Content Warning: This research involves the study of social biases. Consequently, the paper contains examples of discriminatory language and stereotypes that may be sensitive or upsetting to readers. These examples are included for the purpose of scientific analysis and do not reflect the views of the authors.
2025
HESEIA: A community-based dataset for evaluating social biases in large language models, co-designed in real school settings in Latin America
Guido Ivetta | Marcos J Gomez | Sofía Martinelli | Pietro Palombini | M Emilia Echeveste | Nair Carolina Mazzeo | Beatriz Busaniche | Luciana Benotti
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Most resources for evaluating social biases in Large Language Models are developed without co-design from the communities affected by these biases, and rarely involve participatory approaches. We introduce HESEIA, a dataset of 46,499 sentences created in a professional development course. The course involved 370 high-school teachers and 5,370 students from 189 Latin-American schools. Unlike existing benchmarks, HESEIA captures intersectional biases across multiple demographic axes and school subjects. It reflects local contexts through the lived experience and pedagogical expertise of educators. Teachers used minimal pairs to create sentences that express stereotypes relevant to their school subjects and communities. We show the diversity of the dataset in terms of both the demographic axes represented and the knowledge areas included. We demonstrate that the dataset contains more stereotypes unrecognized by current LLMs than previous datasets. HESEIA is available to support bias assessments grounded in educational communities.
CaMMT: Benchmarking Culturally Aware Multimodal Machine Translation
Emilio Villa-Cueva | Sholpan Bolatzhanova | Diana Turmakhan | Kareem Elzeky | Henok Biadglign Ademtew | Alham Fikri Aji | Vladimir Araujo | Israel Abebe Azime | Jinheon Baek | Frederico Belcavello | Fermin Cristobal | Jan Christian Blaise Cruz | Mary Dabre | Raj Dabre | Toqeer Ehsan | Naome A Etori | Fauzan Farooqui | Jiahui Geng | Guido Ivetta | Thanmay Jayakumar | Soyeong Jeong | Zheng Wei Lim | Aishik Mandal | Sofía Martinelli | Mihail Minkov Mihaylov | Daniil Orel | Aniket Pramanick | Sukannya Purkayastha | Israfel Salazar | Haiyue Song | Tiago Timponi Torrent | Debela Desalegn Yadeta | Injy Hamed | Atnafu Lambebo Tonja | Thamar Solorio
Findings of the Association for Computational Linguistics: EMNLP 2025
Translating cultural content poses challenges for machine translation systems due to the differences in conceptualizations between cultures, where language alone may fail to convey sufficient context to capture region-specific meanings. In this work, we investigate whether images can act as cultural context in multimodal translation. We introduce CaMMT, a human-curated benchmark of over 5,800 triples of images along with parallel captions in English and regional languages. Using this dataset, we evaluate five Vision Language Models (VLMs) in text-only and text+image settings. Through automatic and human evaluations, we find that visual context generally improves translation quality, especially in handling Culturally-Specific Items (CSIs), disambiguation, and correct gender marking. By releasing CaMMT, our objective is to support broader efforts to build and evaluate multimodal translation systems that are better aligned with cultural nuance and regional variations.
Co-authors
- Guido Ivetta 3
- Luciana Benotti 2
- M Emilia Echeveste 2
- Marcos J Gomez 2
- Pietro Palombini 2
- Henok Biadglign Ademtew 1
- Alham Fikri Aji 1
- Vladimir Araujo 1
- Israel Abebe Azime 1
- Jinheon Baek 1
- Frederico Belcavello 1
- Sholpan Bolatzhanova 1
- Beatriz Busaniche 1
- Fermin Cristobal 1
- Jan Christian Blaise Cruz 1
- Mary Dabre 1
- Raj Dabre 1
- Sunipa Dev 1
- Toqeer Ehsan 1
- Kareem Elzeky 1
- Naome A. Etori 1
- Fauzan Farooqui 1
- Jiahui Geng 1
- Injy Hamed 1
- Thanmay Jayakumar 1
- Soyeong Jeong 1
- Zheng Wei Lim 1
- Aishik Mandal 1
- Nair Carolina Mazzeo 1
- Mihail Minkov Mihaylov 1
- Daniil Orel 1
- Vinodkumar Prabhakaran 1
- Aniket Pramanick 1
- Sukannya Purkayastha 1
- Israfel Salazar 1
- Thamar Solorio 1
- Haiyue Song 1
- Atnafu Lambebo Tonja 1
- Tiago Timponi Torrent 1
- Diana Turmakhan 1
- Emilio Villa-Cueva 1
- Debela Desalegn Yadeta 1