@inproceedings{wang-liu-2025-beyond,
    title = "Beyond Generation: Leveraging {LLM} Creativity to Overcome Label Bias in Classification",
    author = "Wang, Xiaoyue and
      Liu, Xin",
    editor = "Che, Wanxiang and
      Nabende, Joyce and
      Shutova, Ekaterina and
      Pilehvar, Mohammad Taher",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2025",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.findings-acl.1307/",
    pages = "25500--25506",
    isbn = "979-8-89176-256-5",
    abstract = "Large Language Models (LLMs) exhibit impressive capabilities in In-Context Learning (ICL) but are prone to label bias{---}an undesirable tendency to favor certain answers. Existing calibration methods mitigate bias by leveraging in-domain data, yet such data is often unavailable in real-world scenarios. To address this limitation, we propose SDC (Synthetic Data Calibration), a simple-yet-effective approach that generates synthetic in-domain data from a few in-context demonstrations and utilizes it for calibration. By approximating the benefits of real in-domain data, SDC effectively reduces label bias without requiring access to actual domain-specific inputs. Experimental evaluations on 279 classification and multiple-choice tasks from the Super-NaturalInstructions benchmark. The results show that SDC significantly reduces label bias, achieving an average Bias Score reduction of 57.5{\%}, and outperforming all competitive baselines. Moreover, when combined with Leave-One-Out Calibration (LOOC), further improves performance, underscoring its effectiveness and generalizability in enhancing the reliability of LLMs."
}
Markdown (Informal)
[Beyond Generation: Leveraging LLM Creativity to Overcome Label Bias in Classification](https://aclanthology.org/2025.findings-acl.1307/) (Wang & Liu, Findings 2025)
ACL