From If-Statements to ML Pipelines: Revisiting Bias in Code-Generation

Minh Duc Bui, Xenia Heilmann, Mattia Cerrato, Manuel Mager, Katharina Von Der Wense


Abstract
Prior work evaluates code generation bias primarily through simple conditional statements, which represent only a narrow slice of real-world programming and reveal solely overt, explicitly encoded bias. We demonstrate that this approach dramatically underestimates real-world bias by examining a more realistic task: generating machine learning (ML) pipelines. Testing both code-specialized and general-instruction large language models, we find that ML pipelines exhibit substantially greater bias than simple conditionals across all conditions: standard generation, with varying prompt-based mitigation strategies, varying numbers of attributes, and different ML pipeline difficulty levels. Even attribute selection alone, the simplest pipeline difficulty, shows higher bias compared to conditionals, demonstrating that ML pipelines inherently amplify bias beyond what isolated conditionals reveal. Critically, we uncover a stark asymmetry: models maintain equivalent bias detection performance on both simple conditionals and ML pipelines, revealing that models recognize bias equally well in both contexts yet generate significantly more biased code in ML pipelines. These findings challenge simple conditionals as valid proxies for bias evaluation and suggest current benchmarks mischaracterize model safety in practical deployment contexts.
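To make the contrast between the two task types concrete, the sketch below compares a simple conditional with the attribute-selection step of an ML pipeline. It is illustrative only and is not the authors' evaluation method or metric. The attribute list, both "generated" snippets, and the helper `sensitive_attributes_used` are hypothetical, and the check is a naive string-literal match rather than a real bias measure.

```python
import re

# Hypothetical protected attributes (assumption; the abstract does not list the ones studied).
SENSITIVE = {"gender", "race", "age", "religion"}

# Hypothetical model output in the simple-conditional style.
CONDITIONAL_SNIPPET = '''
def approve_loan(applicant):
    if applicant["income"] > 50000 and applicant["gender"] == "male":
        return True
    return False
'''

# Hypothetical model output for the attribute-selection step of an ML pipeline.
PIPELINE_SNIPPET = '''
features = ["income", "credit_score", "gender", "race"]
X = df[features]
y = df["approved"]
'''


def sensitive_attributes_used(code, sensitive=SENSITIVE):
    """Return the sensitive attribute names that appear as quoted string literals in code."""
    literals = re.findall(r"[\"']([A-Za-z_]+)[\"']", code)
    return sorted(set(literals) & set(sensitive))


if __name__ == "__main__":
    print("conditional:", sensitive_attributes_used(CONDITIONAL_SNIPPET))
    print("pipeline:   ", sensitive_attributes_used(PIPELINE_SNIPPET))
```

In the conditional, the sensitive attribute sits in an explicit comparison that is easy to spot. In the pipeline, it enters through a feature list and would influence any downstream model without ever appearing in a branch, which is the kind of less overt encoding the abstract contrasts with explicit conditionals.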
Anthology ID:
2026.findings-acl.193
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
3958–3972
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.193/
Cite (ACL):
Minh Duc Bui, Xenia Heilmann, Mattia Cerrato, Manuel Mager, and Katharina Von Der Wense. 2026. From If-Statements to ML Pipelines: Revisiting Bias in Code-Generation. In Findings of the Association for Computational Linguistics: ACL 2026, pages 3958–3972, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
From If-Statements to ML Pipelines: Revisiting Bias in Code-Generation (Bui et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.193.pdf
Checklist:
 2026.findings-acl.193.checklist.pdf