Abstract
Automatic Speech Recognition (ASR) is an efficient and widely used input method that transcribes speech signals into text. As the errors introduced by ASR systems will impair the performance of downstream tasks, we introduce a post-processing error correction method, PhVEC, to correct errors in text space. For the errors in ASR result, existing works mainly focus on fixed-length corrections, modifying each wrong token to a correct one (one-to-one correction), but rarely consider the variable-length correction (one-to-many or many-to-one correction). In this paper, we propose an efficient non-autoregressive (NAR) method for Chinese ASR error correction for both cases. Instead of conventionally predicting the sentence length in NAR methods, we propose a novel approach that uses phonological tokens to extend the source sentence for variable-length correction, enabling our model to generate phonetically similar corrections. Experimental results on datasets of different domains show that our method achieves significant improvement in word error rate reduction and speeds up the inference by 6.2 times compared with the autoregressive model.
- Anthology ID:
- 2022.naacl-main.432
- Volume:
- Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
- Month:
- July
- Year:
- 2022
- Address:
- Seattle, United States
- Editors:
- Marine Carpuat, Marie-Catherine de Marneffe, Ivan Vladimir Meza Ruiz
- Venue:
- NAACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 5907–5917
- URL:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2022.naacl-main.432/
- DOI:
- 10.18653/v1/2022.naacl-main.432
- Cite (ACL):
- Zheng Fang, Ruiqing Zhang, Zhongjun He, Hua Wu, and Yanan Cao. 2022. Non-Autoregressive Chinese ASR Error Correction with Phonological Training. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 5907–5917, Seattle, United States. Association for Computational Linguistics.
- Cite (Informal):
- Non-Autoregressive Chinese ASR Error Correction with Phonological Training (Fang et al., NAACL 2022)
- PDF:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2022.naacl-main.432.pdf
- Data
- AISHELL-1