Generative Annotation for ASR Named Entity Correction
Yuanchang Luo, Daimeng Wei, Shaojun Li, Hengchao Shang, Jiaxin Guo, Zongyao Li, Zhanglin Wu, Xiaoyu Chen, Zhiqiang Rao, Jinlong Yang, Hao Yang
Abstract
End-to-end automatic speech recognition systems often fail to transcribe domain-speciffcnamed entities, causing catastrophic failuresin downstream tasks. Numerous fast and lightweight named entity correction (NEC) models have been proposed in recent years. These models, mainly leveraging phonetic-level edit distance algorithms, have shown impressive performances. However, when theforms of the wrongly-transcribed words(s) and the ground-truth entity are signiffcantly different, these methods often fail to locate the wrongly transcribed words in hypothesis, thus limiting their usage. We propose a novel NEC method that utilizes speech sound features to retrieve candidate entities. With speech sound features and candidate entities, we inovatively design a generative method to annotate entityerrors in ASR transcripts and replace the textwith correct entities. This method is effective inscenarios of word form difference. We test ourmethod using open-source and self-constructed test sets. The results demonstrate that our NEC method can bring signiffcant improvement to entity accuracy. We will open source our self constructed test set and training data.- Anthology ID:
- 2025.emnlp-main.1052
- Volume:
- Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
- Month:
- November
- Year:
- 2025
- Address:
- Suzhou, China
- Editors:
- Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
- Venue:
- EMNLP
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 20835–20846
- Language:
- URL:
- https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1052/
- DOI:
- Cite (ACL):
- Yuanchang Luo, Daimeng Wei, Shaojun Li, Hengchao Shang, Jiaxin Guo, Zongyao Li, Zhanglin Wu, Xiaoyu Chen, Zhiqiang Rao, Jinlong Yang, and Hao Yang. 2025. Generative Annotation for ASR Named Entity Correction. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 20835–20846, Suzhou, China. Association for Computational Linguistics.
- Cite (Informal):
- Generative Annotation for ASR Named Entity Correction (Luo et al., EMNLP 2025)
- PDF:
- https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1052.pdf