Distilling the Essence, Discarding the Dross: Improving Fairness in Multimodal Large Language Models via Historical Reflection-Guided Prompt Optimization
Juncheng Hu, Jiming Yu, Rui Song, Kedi Lyu, Yingji Li, Zheli Liu
Abstract
Social bias in Multimodal Large Language Models (MLLMs) has become an increasingly important concern. Prompt-based approaches offer a lightweight solution for debiasing; however, existing methods rely heavily on handcrafted prompts that are brittle, highly context-sensitive, and difficult to generalize across tasks, bias types, and multimodal settings. In this work, we propose Historical Reflection-Guided Prompt Optimization (HRPO), an adaptive self-debiasing framework for black-box MLLMs that automatically optimizes task-specific debiasing prompts to suppress stereotypical outputs. To mitigate forgetting during prompt optimization, we introduce Historical Contrastive Self-Reflection (HCSR), which performs contrastive reflection over positive and negative optimization histories, enabling the model to retain effective prompts and avoid redundant exploration, thereby improving optimization efficiency. Experiments on three benchmarks involving eight open-source and two closed-source MLLMs, covering ten singular and two intersectional bias types, demonstrate that HRPO achieves strong debiasing performance while offering improved interpretability, generalization, and robustness. Code is available at: https://github.com/liyingji1996/HRPO.
- Anthology ID:
- 2026.findings-acl.1459
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2026
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, United States
- Editors:
- Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 29191–29211
- URL:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1459/
- Cite (ACL):
- Juncheng Hu, Jiming Yu, Rui Song, Kedi Lyu, Yingji Li, and Zheli Liu. 2026. Distilling the Essence, Discarding the Dross: Improving Fairness in Multimodal Large Language Models via Historical Reflection-Guided Prompt Optimization. In Findings of the Association for Computational Linguistics: ACL 2026, pages 29191–29211, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal):
- Distilling the Essence, Discarding the Dross: Improving Fairness in Multimodal Large Language Models via Historical Reflection-Guided Prompt Optimization (Hu et al., Findings 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1459.pdf
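
The abstract describes prompt optimization guided by positive and negative optimization histories. The loop below is only a rough sketch of that general idea, not the authors' HRPO or HCSR implementation: `optimize_prompt`, `toy_propose`, `toy_bias_score`, and `HINTS` are hypothetical names, and the scoring and proposal functions are deterministic stand-ins for what would, in practice, be calls to a black-box MLLM and a bias benchmark.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, float]]  # (prompt, bias score); lower score is better


def optimize_prompt(
    seed_prompt: str,
    propose: Callable[[str, History, History], str],
    bias_score: Callable[[str], float],
    steps: int = 5,
) -> Tuple[str, History, History]:
    """Greedy prompt search that keeps both successful and failed candidates."""
    best_prompt, best_score = seed_prompt, bias_score(seed_prompt)
    positives: History = [(best_prompt, best_score)]
    negatives: History = []
    for _ in range(steps):
        # The proposer sees both histories, so it can reuse what worked
        # and avoid repeating candidates that already failed.
        candidate = propose(best_prompt, positives, negatives)
        score = bias_score(candidate)
        if score < best_score:
            best_prompt, best_score = candidate, score
            positives.append((candidate, score))
        else:
            negatives.append((candidate, score))
    return best_prompt, positives, negatives


# Hypothetical stand-ins so the sketch runs without any model access.
HINTS = [
    "Be brief.",
    "Do not rely on stereotypes.",
    "Judge only from visible evidence.",
]


def toy_bias_score(prompt: str) -> float:
    """Assumed scorer: pretends two of the hints reduce measured bias."""
    score = 1.0
    if "stereotypes" in prompt:
        score -= 0.4
    if "visible evidence" in prompt:
        score -= 0.3
    return round(score, 2)


def toy_propose(current: str, positives: History, negatives: History) -> str:
    """Assumed proposer: appends the first hint that yields an untried prompt."""
    tried = {p for p, _ in positives} | {p for p, _ in negatives}
    for hint in HINTS:
        candidate = f"{current} {hint}"
        if candidate not in tried:
            return candidate
    return current


best, positives, negatives = optimize_prompt(
    "Answer the question.", toy_propose, toy_bias_score
)
print(best)
```

In this toy setup the negative history is what stops the proposer from retrying a rejected candidate, which loosely mirrors the abstract's point about avoiding redundant exploration. The released code at the repository linked in the abstract is the authoritative reference for the actual method.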