Flip-Flop Consistency: Unsupervised Training for Robustness to Prompt Perturbations in LLMs
Parsa Hejabi, Elnaz Rahmati, Alireza Salkhordeh Ziabari, Morteza Dehghani
Abstract
Large Language Models (LLMs) often produce inconsistent answers when faced with different phrasings of the same prompt. In this paper, we propose Flip-Flop Consistency (F2C), an unsupervised training method that improves robustness to such perturbations. F2C is composed of two key components. The first, Consensus Cross-Entropy (CCE), uses a majority vote across prompt variations to create a hard pseudo-label. The second is a representation alignment loss that pulls lower-confidence and non-majority predictors toward the consensus established by high-confidence, majority-voting variations. We evaluate our method on 11 datasets spanning four NLP tasks, with 4–15 prompt variations per dataset. On average, F2C raises observed agreement by 11.62%, improves mean F1 by 8.94%, and reduces performance variance across formats by 3.29%. In out-of-domain evaluations, F2C generalizes effectively, increasing mean F1 and agreement while decreasing variance across most source-target pairs. Finally, when trained on only a subset of prompt perturbations and evaluated on held-out formats, F2C consistently improves both performance and agreement while reducing variance. These findings highlight F2C as an effective unsupervised method for enhancing LLM consistency, performance, and generalization under prompt perturbations.
- Anthology ID:
- 2026.acl-long.71
- Volume:
- Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, United States
- Editors:
- Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1571–1587
- URL:
- https://preview.aclanthology.org/ingest-acl/2026.acl-long.71/
- Cite (ACL):
- Parsa Hejabi, Elnaz Rahmati, Alireza Salkhordeh Ziabari, and Morteza Dehghani. 2026. Flip-Flop Consistency: Unsupervised Training for Robustness to Prompt Perturbations in LLMs. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1571–1587, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal):
- Flip-Flop Consistency: Unsupervised Training for Robustness to Prompt Perturbations in LLMs (Hejabi et al., ACL 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl/2026.acl-long.71.pdf
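
Illustrative sketch (not from the paper): the abstract describes Consensus Cross-Entropy as taking a majority vote across prompt variations to form a hard pseudo-label. The snippet below shows one plausible reading of that idea in plain Python. The function names, the toy probabilities, the averaging over variations, and the tie-breaking rule (first label seen wins) are all assumptions made for illustration; the paper's actual loss, weighting, and representation alignment term may differ.

```python
import math
from collections import Counter


def majority_pseudo_label(predictions):
    """Return the label predicted most often across prompt variations.

    Ties go to the label that appears first in `predictions`
    (an assumption; the paper may break ties differently).
    """
    label, _ = Counter(predictions).most_common(1)[0]
    return label


def consensus_cross_entropy(prob_rows, pseudo_label, eps=1e-12):
    """Mean negative log-probability of the pseudo-label over all variations."""
    total = sum(math.log(row[pseudo_label] + eps) for row in prob_rows)
    return -total / len(prob_rows)


# Hypothetical class probabilities for one input under three prompt variations.
probs = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
]

# Each variation's hard prediction is its argmax class.
preds = [max(range(len(row)), key=row.__getitem__) for row in probs]
label = majority_pseudo_label(preds)
loss = consensus_cross_entropy(probs, label)
```

In this toy case two of the three variations predict class 0, so class 0 becomes the pseudo-label, and the third variation contributes the largest share of the loss because it assigns that class the lowest probability.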