Graph-Based Phonetic Error Correction of Noisy ASR

Pratik Rakesh Singh, Mohammadi Zaki, Aneesh Mukkamla, Pankaj Wasnik


Abstract
Automatic speech recognition (ASR) systems, despite low overall word error rates, produce residual lexical errors that disproportionately affect semantically critical tokens such as named entities, negations, and sentiment-bearing words. These errors are often structured, arising from phonetic similarity rather than random noise, which makes naive token-level correction insufficient. We propose G-SPIN, a structured ASR correction framework that combines phonetic graph modeling with contextual language understanding. A graph neural network (GNN) first constructs acoustically plausible candidate neighborhoods for flagged tokens, explicitly restricting the correction search space to phonetic alternatives. A masked language model (MLM) then provides local contextual scoring, and an instruction-tuned large language model (LLM) performs final context-aware re-ranking over this compact candidate set. By decoupling structured phonetic reasoning from contextual semantic selection, our method avoids unconstrained generation while improving correction accuracy. The framework is lightweight, modular, and operates entirely at inference time.
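The two-stage idea in the abstract (restrict candidates to phonetic neighbors, then select by context) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy pronunciation lexicon, the phoneme-overlap similarity (standing in for the GNN), the bigram table (standing in for the MLM and LLM scorers), and all function names are assumptions made purely for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical toy pronunciation lexicon (ARPAbet-like strings); not from the paper.
LEXICON = {
    "can": "K AE N",
    "can't": "K AE N T",
    "cat": "K AE T",
    "ken": "K EH N",
    "dog": "D AO G",
}

# Hypothetical bigram counts, a stand-in for the MLM/LLM contextual scorers.
BIGRAMS = {("the", "cat"): 5, ("cat", "sat"): 4, ("the", "can"): 2}


def phonetic_similarity(a, b):
    """Phoneme-sequence overlap in [0, 1]; a crude stand-in for a learned GNN."""
    return SequenceMatcher(None, LEXICON[a].split(), LEXICON[b].split()).ratio()


def candidate_neighborhood(token, k=3, threshold=0.5):
    """Return the flagged token plus up to k phonetically close alternatives."""
    scored = [(phonetic_similarity(token, w), w) for w in LEXICON if w != token]
    close = [p for p in scored if p[0] >= threshold]
    close.sort(key=lambda p: (-p[0], p[1]))  # best first, ties broken by spelling
    return [token] + [w for _, w in close[:k]]


def context_score(left, candidate, right):
    """Sum of bigram counts with the immediate left and right neighbors."""
    return BIGRAMS.get((left, candidate), 0) + BIGRAMS.get((candidate, right), 0)


def correct(tokens, idx):
    """Replace tokens[idx] with the best-scoring phonetic candidate."""
    left = tokens[idx - 1] if idx > 0 else None
    right = tokens[idx + 1] if idx + 1 < len(tokens) else None
    candidates = candidate_neighborhood(tokens[idx])
    # max() keeps the first maximum, so the original token wins ties.
    best = max(candidates, key=lambda c: context_score(left, c, right))
    return tokens[:idx] + [best] + tokens[idx + 1:]


print(candidate_neighborhood("can"))
print(correct(["the", "can", "sat"], 1))
```

Because the selection step only ever chooses among the phonetic neighborhood, it cannot introduce words that are acoustically implausible for the flagged token, which is the property the abstract attributes to avoiding unconstrained generation.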
Anthology ID:
2026.acl-industry.151
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Month:
July
Year:
2026
Address:
San Diego, California, USA
Editors:
Yunyao Li, Georg Rehm, Mei Tu
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
2261–2270
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-industry.151/
Cite (ACL):
Pratik Rakesh Singh, Mohammadi Zaki, Aneesh Mukkamla, and Pankaj Wasnik. 2026. Graph-Based Phonetic Error Correction of Noisy ASR. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026), pages 2261–2270, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal):
Graph-Based Phonetic Error Correction of Noisy ASR (Singh et al., ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-industry.151.pdf