UMAD: Enhancing LLM Debiasing via Multi-Agent Debate and Token-Level Bias Interpretation

Hanwen Gu, JieMa JieMa, Ying Qin, Ling Hu


Abstract
Textual data often contain biases that compromise fairness in AI systems, particularly in sensitive areas such as gender, race, and politics. While large language models (LLMs) have shown success across various tasks, they still face limitations due to inherent biases within the models and restrictive safety policies that hinder direct bias mitigation. To overcome these challenges, we propose UMAD (Unsupervised Multi-Agent Debate), a novel framework that leverages a Multi-Agent Debate mechanism alongside Best-Worst Scaling (BWS) to foster more effective discussions among LLMs, facilitating the identification of biases. By combining this with gradient-based interpretation techniques, UMAD extracts token-level bias insights, which are then integrated into models using in-context learning. This enhances debiasing performance, as shown by our experiments across three bias categories (gender, religion, and politics) using five different LLMs. Our approach demonstrates significant improvements in metrics, with large models matching or even surpassing GPT-4 in Style Accuracy (STA). We release our code at: https://github.com/Couen/UMAD.git
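The abstract mentions gradient-based interpretation for extracting token-level bias insights. As a rough illustration of that general idea (not the paper's actual implementation), the sketch below uses a toy linear "bias score" head over toy token embeddings and computes a gradient-times-input saliency per token; the vocabulary, embeddings, and weights are all invented for the example.

```python
# Hedged sketch of gradient-x-input token saliency for a bias score.
# The model here is a toy linear head over mean-pooled embeddings,
# standing in for an LLM's bias logit; all values are illustrative.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "nurse", "said", "she", "is", "kind"]
emb = {tok: rng.normal(size=4) for tok in vocab}   # toy 4-d embeddings
w = rng.normal(size=4)                             # toy linear bias-score head

def bias_score(tokens):
    # score = w . mean(token embeddings)
    return float(w @ np.mean([emb[t] for t in tokens], axis=0))

def token_saliency(tokens):
    # For this linear model, d(score)/d(e_t) = w / n, so the
    # gradient-x-input saliency of token t is |w . e_t| / n.
    n = len(tokens)
    return {t: abs(float(w @ emb[t])) / n for t in tokens}

sent = ["the", "nurse", "said", "she", "is", "kind"]
sal = token_saliency(sent)
top = max(sal, key=sal.get)  # token contributing most to the bias score
```

For a real LLM the per-token gradient would come from backpropagation rather than a closed form, but the attribution step (score gradient contracted with the token's input representation) is the same shape.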
Anthology ID:
2025.ccl-1.81
Volume:
Proceedings of the 24th China National Conference on Computational Linguistics (CCL 2025)
Month:
August
Year:
2025
Address:
Jinan, China
Editors:
Maosong Sun, Peiyong Duan, Zhiyuan Liu, Ruifeng Xu, Weiwei Sun
Venue:
CCL
Publisher:
Chinese Information Processing Society of China
Pages:
1078–1094
URL:
https://preview.aclanthology.org/ingest-ccl/2025.ccl-1.81/
Cite (ACL):
Hanwen Gu, JieMa JieMa, Ying Qin, and Ling Hu. 2025. UMAD: Enhancing LLM Debiasing via Multi-Agent Debate and Token-Level Bias Interpretation. In Proceedings of the 24th China National Conference on Computational Linguistics (CCL 2025), pages 1078–1094, Jinan, China. Chinese Information Processing Society of China.
Cite (Informal):
UMAD: Enhancing LLM Debiasing via Multi-Agent Debate and Token-Level Bias Interpretation (Gu et al., CCL 2025)
PDF:
https://preview.aclanthology.org/ingest-ccl/2025.ccl-1.81.pdf