Chaeeun Joy Lee


2025

(Dis)improved?! How Simplified Language Affects Large Language Model Performance across Languages
Miriam Anschütz | Anastasiya Damaratskaya | Chaeeun Joy Lee | Arthur Schmalz | Edoardo Mosca | Georg Groh
Proceedings of the Fourth Workshop on Generation, Evaluation and Metrics (GEM²)

Simplified language enhances the accessibility and human understanding of texts. However, whether it also benefits large language models (LLMs) remains underexplored. This paper extensively studies whether LLM performance improves on simplified data compared to its original counterpart. Our experiments span six datasets and nine automatic simplification systems across three languages. We show that English models, including GPT-4o-mini, generalize weakly and exhibit a significant performance drop on simplified data. This introduces an intriguing paradox: simplified data is helpful for humans but not for LLMs. At the same time, performance in non-English languages sometimes improves, depending on the task and the quality of the simplifier. Our findings offer a comprehensive view of the impact of simplified language on LLM performance and uncover severe implications for people who depend on simple language.