Iain Weissburg


2025

LLMs are Biased Teachers: Evaluating LLM Bias in Personalized Education
Iain Weissburg | Sathvika Anand | Sharon Levy | Haewon Jeong
Findings of the Association for Computational Linguistics: NAACL 2025

With the increasing adoption of large language models (LLMs) in education, concerns about inherent biases in these models have gained prominence. We evaluate LLMs for bias in the personalized educational setting, specifically focusing on the models’ roles as “teachers.” We reveal significant biases in how models generate and select educational content tailored to different demographic groups, including race, ethnicity, sex, gender, disability status, income, and national origin. We introduce and apply two bias score metrics, Mean Absolute Bias (MAB) and Maximum Difference Bias (MDB), to analyze nine open and closed state-of-the-art LLMs. Our experiments, which utilize over 17,000 educational explanations across multiple difficulty levels and topics, show that models can harm student learning both by perpetuating harmful stereotypes and by reversing them. We find that bias is similar across all frontier models, with MAB highest along income levels and MDB highest for both income and disability status. For both metrics, bias is lowest for sex/gender and race/ethnicity.
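
The abstract names the two metrics but does not spell out their formulas. Below is a minimal sketch of one plausible reading, assuming MAB is the mean absolute deviation of per-group scores from the cross-group mean and MDB is the largest gap between any two groups; the function names and example scores are hypothetical illustrations, not the paper's exact definitions or data.

```python
from statistics import mean

def mean_absolute_bias(group_scores: dict[str, float]) -> float:
    # Assumed MAB: average absolute deviation of each demographic
    # group's score from the mean score over all groups.
    overall = mean(group_scores.values())
    return mean(abs(s - overall) for s in group_scores.values())

def maximum_difference_bias(group_scores: dict[str, float]) -> float:
    # Assumed MDB: widest gap between the best- and worst-served groups.
    return max(group_scores.values()) - min(group_scores.values())

# Hypothetical per-group quality scores for one set of explanations.
scores = {"group_a": 0.82, "group_b": 0.74, "group_c": 0.79}
print(mean_absolute_bias(scores))       # ~0.029
print(maximum_difference_bias(scores))  # 0.08
```

Under this reading, MAB captures average deviation across all groups, while MDB flags the single worst disparity, which is why the two can rank demographic attributes differently.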

Human Bias in the Face of AI: Examining Human Judgment Against Text Labeled as AI Generated
Tiffany Zhu | Iain Weissburg | Kexun Zhang | William Yang Wang
Findings of the Association for Computational Linguistics: ACL 2025

As AI advances in text generation, human trust in AI-generated content remains constrained by biases that go beyond concerns of accuracy. This study explores how bias shapes the perception of AI-generated versus human-generated content. Through three experiments involving text rephrasing, news article summarization, and persuasive writing, we investigated how human raters respond to labeled and unlabeled content. While the raters could not differentiate the two types of texts in the blind test, they overwhelmingly favored content labeled “Human Generated” over content labeled “AI Generated,” by a preference score of over 30%. We observed the same pattern even when the labels were deliberately swapped. This human bias against AI has broader societal and cognitive implications, as it undervalues AI performance. This study highlights the limitations of human judgment in interacting with AI and offers a foundation for improving human-AI collaboration, especially in creative fields.
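
A short sketch of how such a preference score might be computed, assuming it is the share of trials in which raters chose the text labeled “Human Generated” minus the share in which they chose the text labeled “AI Generated”; this reading and the rater counts below are illustrative assumptions, not the paper's reported data.

```python
def preference_score(choices: list[str]) -> float:
    # Assumed definition: fraction of raters preferring the text
    # labeled "Human Generated" minus the fraction preferring the
    # text labeled "AI Generated".
    human = choices.count("human") / len(choices)
    ai = choices.count("ai") / len(choices)
    return human - ai

# Hypothetical choices from 100 raters on one comparison task.
choices = ["human"] * 66 + ["ai"] * 34
print(preference_score(choices))  # 0.32, i.e. over 30%
```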