多特征融合的越英端到端语音翻译方法(A Vietnamese-English end-to-end speech translation method based on multi-feature fusion)
Houli Ma (马候丽), Ling Dong (董凌), Wenjun Wang (王文君), Jian Wang (王剑), Shengxiang Gao (高盛祥), Zhengtao Yu (余正涛)
Abstract
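The encoder in a speech translation model must encode both the acoustic and the semantic information in speech, and a single Fbank or Wav2vec2 speech feature has insufficient representational capacity for this. By analyzing the differences between hand-crafted Fbank features and self-supervised Wav2vec2 features, this paper proposes an acoustic feature fusion method based on a cross-attention mechanism, and explores different self-supervised features and fusion schemes to strengthen the model's learning of the acoustic and semantic information in speech. Taking the characteristics of Vietnamese speech into account, the Fbank representation is encoded from a mixture in which Fbank features are primary and Pitch features are auxiliary, and a multi-feature fusion Vietnamese-English speech translation model is built on top of it. Experiments show that the speech translation model using multiple features translates better than single-feature models and is more effective than simple feature concatenation; the proposed multi-feature fusion method improves the BLEU score by 1.97 on the Vietnamese-English speech translation task.
- Anthology ID: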
- 2022.ccl-1.27
- Volume:
- Proceedings of the 21st Chinese National Conference on Computational Linguistics
- Month:
- October
- Year:
- 2022
- Address:
- Nanchang, China
- Editors:
- Maosong Sun (孙茂松), Yang Liu (刘洋), Wanxiang Che (车万翔), Yang Feng (冯洋), Xipeng Qiu (邱锡鹏), Gaoqi Rao (饶高琦), Yubo Chen (陈玉博)
- Venue:
- CCL
- Publisher:
- Chinese Information Processing Society of China
- Pages:
- 293–304
- Language:
- Chinese
- URL:
- https://aclanthology.org/2022.ccl-1.27
- Cite (ACL):
- Houli Ma, Ling Dong, Wenjun Wang, Jian Wang, Shengxiang Gao, and Zhengtao Yu. 2022. 多特征融合的越英端到端语音翻译方法(A Vietnamese-English end-to-end speech translation method based on multi-feature fusion). In Proceedings of the 21st Chinese National Conference on Computational Linguistics, pages 293–304, Nanchang, China. Chinese Information Processing Society of China.
- Cite (Informal):
- 多特征融合的越英端到端语音翻译方法(A Vietnamese-English end-to-end speech translation method based on multi-feature fusion) (Ma et al., CCL 2022)
- PDF:
- https://preview.aclanthology.org/ingest-2024-clasp/2022.ccl-1.27.pdf
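The abstract describes fusing Fbank (plus Pitch) features with self-supervised Wav2vec2 features through cross-attention. The sketch below is one generic way such a fusion could look, not the authors' implementation: the Fbank frames act as queries, the self-supervised frames act as keys and values, and a residual connection keeps the original Fbank information. All names (`cross_attention_fuse`, `w_q`, `w_k`, `w_v`), the toy dimensions, the random weights, and the choice of which stream provides the queries are assumptions made for illustration; the abstract does not specify them.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(rows, weight):
    """Multiply a (T x d_in) list of rows by a (d_in x d_out) weight matrix."""
    return [[sum(x * w for x, w in zip(row, col)) for col in zip(*weight)]
            for row in rows]


def cross_attention_fuse(fbank, ssl, w_q, w_k, w_v):
    """Hypothetical fusion: Fbank frames attend over self-supervised frames.

    fbank: T_f x d_f, ssl: T_s x d_s,
    w_q: d_f x d, w_k: d_s x d, w_v: d_s x d_f.
    Returns T_f x d_f: each Fbank frame plus an attention-weighted
    summary of the self-supervised frames.
    """
    queries = matmul(fbank, w_q)
    keys = matmul(ssl, w_k)
    values = matmul(ssl, w_v)
    scale = math.sqrt(len(w_q[0]))  # scaled dot-product attention
    fused = []
    for frame, q in zip(fbank, queries):
        weights = softmax([sum(a * b for a, b in zip(q, k)) / scale
                           for k in keys])
        context = [sum(w * v[j] for w, v in zip(weights, values))
                   for j in range(len(frame))]
        # Residual connection keeps the original Fbank information.
        fused.append([x + c for x, c in zip(frame, context)])
    return fused


def random_matrix(rows, cols, rng):
    """Toy stand-in for real features or learned weights (assumption)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
fbank = random_matrix(6, 4, rng)  # 6 frames of toy 4-dim Fbank+Pitch features
ssl = random_matrix(3, 5, rng)    # 3 frames of toy 5-dim Wav2vec2-style features
fused = cross_attention_fuse(
    fbank, ssl,
    w_q=random_matrix(4, 8, rng),
    w_k=random_matrix(5, 8, rng),
    w_v=random_matrix(5, 4, rng),
)
print(len(fused), len(fused[0]))
```

Unlike plain concatenation, this form does not require the two streams to share a frame rate: the fused output keeps the length and width of the Fbank sequence regardless of how many self-supervised frames there are. In a trained model the projection matrices would be learned parameters rather than random values.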