When Format Changes Meaning: Investigating Semantic Inconsistency of Large Language Models

Cheongwoong Kang, Jongeun Baek, Yeonjea Kim, Jaesik Choi


Abstract
Large language models (LLMs) have demonstrated remarkable capabilities in natural language processing tasks. However, they remain vulnerable to semantic inconsistency, where minor formatting variations result in divergent predictions for semantically equivalent inputs. Our comprehensive evaluation reveals that this brittleness persists even in state-of-the-art models such as GPT-4o, posing a serious challenge to their reliability. Through a mechanistic analysis, we find that semantically equivalent input changes induce instability in internal representations, ultimately leading to divergent predictions. This reflects a deeper structural issue in which form and meaning are intertwined in the embedding space. We further demonstrate that existing mitigation strategies, including direct fine-tuning on format variations, do not fully resolve semantic inconsistency, underscoring the difficulty of the problem. Our findings highlight the need for a deeper mechanistic understanding to develop targeted methods that improve robustness.
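
For illustration only (not taken from the paper): a minimal Python sketch of how format sensitivity of the kind described in the abstract could be probed, assuming a hypothetical query_model callable that maps a prompt string to the model's answer. The prompt templates below are invented examples of semantically equivalent formats; the function reports how often the model's predictions agree across them.

from collections import Counter
from typing import Callable

# Illustrative, semantically equivalent prompt templates for one
# multiple-choice question (not the paper's actual formats).
FORMATS = [
    "Question: {q}\nOptions: {opts}\nAnswer:",
    "Q: {q}\nChoices: {opts}\nA:",
    "{q}\n{opts}\nThe correct option is",
]

def format_consistency(query_model: Callable[[str], str],
                       question: str, options: str) -> float:
    """Fraction of format variants whose prediction matches the majority answer."""
    preds = [query_model(f.format(q=question, opts=options)).strip().lower()
             for f in FORMATS]
    majority_count = Counter(preds).most_common(1)[0][1]
    return majority_count / len(preds)

if __name__ == "__main__":
    # Trivial stand-in model for demonstration; a real study would call an LLM here.
    dummy_model = lambda prompt: "B" if "Answer:" in prompt else "C"
    print(format_consistency(dummy_model, "2 + 2 = ?", "A) 3  B) 4  C) 5"))
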
Anthology ID:
2025.findings-emnlp.143
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
2647–2667
URL:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.143/
DOI:
10.18653/v1/2025.findings-emnlp.143
Cite (ACL):
Cheongwoong Kang, Jongeun Baek, Yeonjea Kim, and Jaesik Choi. 2025. When Format Changes Meaning: Investigating Semantic Inconsistency of Large Language Models. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 2647–2667, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
When Format Changes Meaning: Investigating Semantic Inconsistency of Large Language Models (Kang et al., Findings 2025)
PDF:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.143.pdf
Checklist:
2025.findings-emnlp.143.checklist.pdf