Principled Self-Correction in Discrete Diffusion: A UCB-Guided Framework for Text Generation

Masaki Asada; Makoto Miwa

Abstract

Following their success in image synthesis, diffusion models offer a flexible, iterative alternative to rigid left-to-right text generation. However, a fundamental training-inference discrepancy hinders their performance: models are trained on corrupted ground-truth tokens, but at inference time they must denoise inputs corrupted from their own predictions. To bridge this gap, we propose a unified framework with two components. First, Deeper Self-Prediction (DSP) is a multi-step training objective that teaches robust self-correction by forcing the model to denoise its own intermediate outputs. Second, UCB-guided Decoding is a principled inference algorithm that frames token re-masking as a multi-armed bandit problem, using the Upper Confidence Bound (UCB) to balance exploration and exploitation. Experiments on text generation tasks demonstrate consistent improvements over existing diffusion baselines: the framework achieves higher faithfulness and coherence according to both automatic metrics and LLM-as-a-Judge evaluations.
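
To make the bandit framing concrete, the following is a minimal sketch of how a textbook UCB1-style score could be used to pick positions to re-mask. It is not the authors' algorithm: the abstract does not give the scoring rule, so the use of (1 - confidence) as the estimated value of re-masking a position, the per-position re-mask counts, the exploration constant c, and the function names are all assumptions made for illustration.

```python
import math


def ucb_remask_scores(confidences, remask_counts, step, c=1.0):
    """Hypothetical UCB1-style score per position (higher = re-mask sooner).

    Assumption: the value of re-masking position i is estimated as
    1 - confidence_i, and remask_counts[i] plays the role of the number
    of times arm i has been pulled. step is the current decoding step (>= 1).
    """
    scores = []
    for conf, n in zip(confidences, remask_counts):
        if n == 0:
            # Positions never re-masked so far are explored first.
            scores.append(math.inf)
        else:
            bonus = c * math.sqrt(math.log(step) / n)
            scores.append((1.0 - conf) + bonus)
    return scores


def select_remask_positions(confidences, remask_counts, step, k, c=1.0):
    """Return the k positions with the highest score, in ascending index order."""
    scores = ucb_remask_scores(confidences, remask_counts, step, c)
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:k])


if __name__ == "__main__":
    conf = [0.9, 0.2, 0.6, 0.95]   # model confidence per generated token
    counts = [3, 3, 1, 3]          # times each position was re-masked so far
    print(select_remask_positions(conf, counts, step=4, k=1, c=0.0))
    print(select_remask_positions(conf, counts, step=4, k=1, c=1.0))
```

With c = 0 the rule reduces to pure exploitation and picks the least confident position; with c = 1 the exploration bonus favors the position that has been revisited least, which is the trade-off the abstract refers to.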

Anthology ID: 2026.eacl-long.314
Volume: Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: March
Year: 2026
Address: Rabat, Morocco
Editors: Vera Demberg, Kentaro Inui, Lluís Marquez
Venue: EACL
Publisher: Association for Computational Linguistics
Pages: 6678–6692
URL: https://preview.aclanthology.org/ingest-eacl/2026.eacl-long.314/
Cite (ACL): Masaki Asada and Makoto Miwa. 2026. Principled Self-Correction in Discrete Diffusion: A UCB-Guided Framework for Text Generation. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 6678–6692, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal): Principled Self-Correction in Discrete Diffusion: A UCB-Guided Framework for Text Generation (Asada & Miwa, EACL 2026)
PDF: https://preview.aclanthology.org/ingest-eacl/2026.eacl-long.314.pdf
