PAR: Training-Free Positional Perturbation and Attention Recycling for Faithful OCR

Yao Yao, Manwen Liao, Weitian Zhang, Zuchao Li, Hai Zhao


Abstract
In high-precision scenarios, vision-language models suffer from Linguistic Priors Hallucination. When processing familiar text, models tend to over-rely on internal parametric knowledge, effectively "reciting" the content rather than "reading" the image. In this paper, we first systematically investigate this phenomenon by constructing the GlitchText Probing Dataset. We discover that the model’s reliance on visual grounding diminishes significantly as the generation length increases. To mitigate this, we propose PAR (Positional Perturbation and Attention Recycling), a training-free, inference-time intervention framework. PAR consists of two parts: (1) Positional Perturbation (PP), which injects structured phase noise into the rotary positional embeddings; and (2) Foveal Attention Recycling (FAR), which detects over-confident linguistic priors and dynamically redistributes attention mass back to important visual regions. Extensive experiments across state-of-the-art models demonstrate that PAR significantly reduces hallucination rates (reducing CER by 12%), particularly in long-context scenarios, while maintaining robust generalization on standard benchmarks.
Anthology ID:
2026.acl-long.1065
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
23258–23273
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.1065/
Cite (ACL):
Yao Yao, Manwen Liao, Weitian Zhang, Zuchao Li, and Hai Zhao. 2026. PAR: Training-Free Positional Perturbation and Attention Recycling for Faithful OCR. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 23258–23273, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
PAR: Training-Free Positional Perturbation and Attention Recycling for Faithful OCR (Yao et al., ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.1065.pdf
Checklist:
2026.acl-long.1065.checklist.pdf