Yeonjea Kim


2025

When Format Changes Meaning: Investigating Semantic Inconsistency of Large Language Models
Cheongwoong Kang | Jongeun Baek | Yeonjea Kim | Jaesik Choi
Findings of the Association for Computational Linguistics: EMNLP 2025

Large language models (LLMs) have demonstrated remarkable capabilities in natural language processing tasks. However, they remain vulnerable to semantic inconsistency, where minor formatting variations result in divergent predictions for semantically equivalent inputs. Our comprehensive evaluation reveals that this brittleness persists even in state-of-the-art models such as GPT-4o, posing a serious challenge to their reliability. Through a mechanistic analysis, we find that semantically equivalent input changes induce instability in internal representations, ultimately leading to divergent predictions. This reflects a deeper structural issue, where form and meaning are intertwined in the embedding space. We further demonstrate that existing mitigation strategies, including direct fine-tuning on format variations, do not fully address semantic inconsistency, underscoring the difficulty of the problem. Our findings highlight the need for deeper mechanistic understanding to develop targeted methods that improve robustness.