DeRAGEC: Denoising Named Entity Candidates with Synthetic Rationale for ASR Error Correction

Solee Im, Wonjun Lee, JinMyeong An, Yunsu Kim, Jungseul Ok, Gary Lee


Abstract
We present DeRAGEC, a method for improving Named Entity (NE) correction in Automatic Speech Recognition (ASR) systems. DeRAGEC extends the Retrieval-Augmented Generative Error Correction (RAGEC) framework by using synthetic denoising rationales to filter out noisy NE candidates before correction. It leverages phonetic similarity and augmented definitions to refine the retrieved NEs through in-context learning, requiring no additional training. Experimental results on the CommonVoice and STOP datasets show significant improvements in Word Error Rate (WER) and NE hit ratio, outperforming baseline ASR and RAGEC methods. Specifically, we achieve a 28% relative reduction in WER compared to ASR without postprocessing.
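For readers unfamiliar with the metric, the following is a minimal sketch of how WER and a relative WER reduction are conventionally computed. It is not the authors' evaluation code. The function names, the example transcripts, and the 0.50 and 0.36 error rates are illustrative assumptions and do not come from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, improved_wer: float) -> float:
    """Fraction of the baseline error removed by the improved system."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - improved_wer) / baseline_wer


# Hypothetical transcripts: a misrecognized named entity costs two word errors.
wer_example = word_error_rate("play songs by adele", "play songs by a dell")

# Hypothetical rates chosen only to illustrate a 28% relative reduction.
reduction_example = relative_reduction(0.50, 0.36)
```

A relative reduction is measured against the baseline error rate, so a 28% relative reduction means that 28% of the baseline's errors were removed, not that WER dropped by 28 percentage points.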
Anthology ID:
2025.findings-acl.786
Volume:
Findings of the Association for Computational Linguistics: ACL 2025
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
15181–15193
URL:
https://preview.aclanthology.org/mtsummit-25-ingestion/2025.findings-acl.786/
DOI:
10.18653/v1/2025.findings-acl.786
Cite (ACL):
Solee Im, Wonjun Lee, JinMyeong An, Yunsu Kim, Jungseul Ok, and Gary Lee. 2025. DeRAGEC: Denoising Named Entity Candidates with Synthetic Rationale for ASR Error Correction. In Findings of the Association for Computational Linguistics: ACL 2025, pages 15181–15193, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
DeRAGEC: Denoising Named Entity Candidates with Synthetic Rationale for ASR Error Correction (Im et al., Findings 2025)
PDF:
https://preview.aclanthology.org/mtsummit-25-ingestion/2025.findings-acl.786.pdf