Distilling the Essence, Discarding the Dross: Improving Fairness in Multimodal Large Language Models via Historical Reflection-Guided Prompt Optimization

Juncheng Hu, Jiming Yu, Rui Song, Kedi Lyu, Yingji Li, Zheli Liu


Abstract
Social bias in Multimodal Large Language Models (MLLMs) has become an increasingly important concern. Prompt-based approaches offer a lightweight solution for debiasing; however, existing methods rely heavily on handcrafted prompts that are brittle, highly context-sensitive, and difficult to generalize across tasks, bias types, and multimodal settings. In this work, we propose Historical Reflection-Guided Prompt Optimization (HRPO), an adaptive self-debiasing framework for black-box MLLMs that automatically optimizes task-specific debiasing prompts to suppress stereotypical outputs. To mitigate forgetting during prompt optimization, we introduce Historical Contrastive Self-Reflection (HCSR), which performs contrastive reflection over positive and negative optimization histories, enabling the model to retain effective prompts and avoid redundant exploration, thereby improving optimization efficiency. Experiments on three benchmarks involving eight open-source and two closed-source MLLMs, covering ten singular and two intersectional bias types, demonstrate that HRPO achieves strong debiasing performance while offering improved interpretability, generalization, and robustness. Code is available at: https://github.com/liyingji1996/HRPO.
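As a rough illustration of the idea in the abstract, the sketch below shows a prompt-optimization loop that keeps separate positive and negative histories and consults them before trying a new candidate. It is not the authors' implementation: every name here (`History`, `optimize_prompt`, `toy_propose`, `toy_bias_score`, `CLAUSES`) is hypothetical, the proposal step is a fixed rule rather than an MLLM performing reflection, and the bias score is a toy keyword count rather than a benchmark metric.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class History:
    # Candidates that lowered the (toy) bias score, with their scores.
    positive: List[Tuple[str, int]] = field(default_factory=list)
    # Candidates that failed to lower it, kept so they are not retried.
    negative: List[Tuple[str, int]] = field(default_factory=list)

    def seen(self) -> set:
        return {p for p, _ in self.positive + self.negative}


def optimize_prompt(
    seed: str,
    propose: Callable[[str, History], str],
    bias_score: Callable[[str], int],
    steps: int = 5,
) -> Tuple[str, int, History]:
    """Greedy search over prompts; lower bias_score is better."""
    history = History()
    best_prompt, best_score = seed, bias_score(seed)
    for _ in range(steps):
        candidate = propose(best_prompt, history)
        # Skip candidates already explored (avoids redundant exploration).
        if candidate == best_prompt or candidate in history.seen():
            continue
        score = bias_score(candidate)
        if score < best_score:
            history.positive.append((candidate, score))
            best_prompt, best_score = candidate, score
        else:
            history.negative.append((candidate, score))
    return best_prompt, best_score, history


# --- Toy stand-ins (assumptions, not part of the paper) ---
CLAUSES = [
    "Avoid stereotypes.",
    "Answer quickly.",
    "Judge only visible evidence.",
]


def toy_propose(current: str, history: History) -> str:
    """Append the first clause that is neither present nor already tried."""
    tried = history.seen()
    for clause in CLAUSES:
        if clause in current:
            continue
        candidate = f"{current} {clause}"
        if candidate not in tried:
            return candidate
    return current  # nothing new to try


def toy_bias_score(prompt: str) -> int:
    """Made-up integer score; a real system would query a benchmark."""
    score = 10
    if "stereotypes" in prompt:
        score -= 4
    if "evidence" in prompt:
        score -= 3
    if "quickly" in prompt:
        score += 2
    return score
```

Calling `optimize_prompt("Describe the person in the image.", toy_propose, toy_bias_score)` returns the best prompt found, its toy score, and both histories. In a realistic setting the `propose` callable would presumably prompt the model with contrasting examples drawn from `history.positive` and `history.negative`; that step is only stubbed here.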
Anthology ID:
2026.findings-acl.1459
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
29191–29211
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1459/
Cite (ACL):
Juncheng Hu, Jiming Yu, Rui Song, Kedi Lyu, Yingji Li, and Zheli Liu. 2026. Distilling the Essence, Discarding the Dross: Improving Fairness in Multimodal Large Language Models via Historical Reflection-Guided Prompt Optimization. In Findings of the Association for Computational Linguistics: ACL 2026, pages 29191–29211, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Distilling the Essence, Discarding the Dross: Improving Fairness in Multimodal Large Language Models via Historical Reflection-Guided Prompt Optimization (Hu et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1459.pdf
Checklist:
 2026.findings-acl.1459.checklist.pdf