An Experiment on Speech-to-Text Translation Systems for Manipuri to English on Low Resource Setting

Loitongbam Sanayai Meetei, Laishram Rahul, Alok Singh, Salam Michael Singh, Thoudam Doren Singh, Sivaji Bandyopadhyay


Abstract
In this paper, we report the experimental findings of building Speech-to-Text translation systems for Manipuri-English in a low-resource setting, the first of their kind for this language pair. For this purpose, we build a new dataset consisting of a Manipuri-English parallel corpus along with audio recordings of the Manipuri text. Based on this dataset, we report a benchmark evaluation of Manipuri-English Speech-to-Text translation using two approaches: 1) a pipeline model consisting of Automatic Speech Recognition (ASR) and Machine Translation (MT), and 2) an end-to-end Speech-to-Text translation model. Gaussian Mixture Model-Hidden Markov Model (GMM-HMM) and Time Delay Neural Network (TDNN) acoustic models are used to build two different pipeline systems that share a single MT system. Experimental results show that the TDNN model outperforms the GMM-HMM model significantly, by a margin of 2.53% WER (word error rate). However, the Speech-to-Text translation scores of the two pipeline systems differ by a small margin of only 0.1 BLEU. Both pipeline translation models outperform the end-to-end translation model by a margin of 2.6 BLEU.
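For readers unfamiliar with the WER figure quoted above, the following is a minimal sketch of word error rate as it is commonly defined: the word-level edit distance between the ASR hypothesis and the reference transcript, divided by the number of reference words. The function name and the sample strings are illustrative assumptions; the abstract does not state which scoring tool or text normalization the authors used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper only; not the authors' scoring script.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Placeholder tokens, not data from the paper.
    print(word_error_rate("a b c d", "a x c"))
```

Under this definition, a hypothesis with one substituted and one dropped word against a four-word reference scores 2/4, which is usually reported as 50% WER.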
Anthology ID:
2021.icon-main.8
Volume:
Proceedings of the 18th International Conference on Natural Language Processing (ICON)
Month:
December
Year:
2021
Address:
National Institute of Technology Silchar, Silchar, India
Editors:
Sivaji Bandyopadhyay, Sobha Lalitha Devi, Pushpak Bhattacharyya
Venue:
ICON
Publisher:
NLP Association of India (NLPAI)
Pages:
54–63
URL:
https://aclanthology.org/2021.icon-main.8
Cite (ACL):
Loitongbam Sanayai Meetei, Laishram Rahul, Alok Singh, Salam Michael Singh, Thoudam Doren Singh, and Sivaji Bandyopadhyay. 2021. An Experiment on Speech-to-Text Translation Systems for Manipuri to English on Low Resource Setting. In Proceedings of the 18th International Conference on Natural Language Processing (ICON), pages 54–63, National Institute of Technology Silchar, Silchar, India. NLP Association of India (NLPAI).
Cite (Informal):
An Experiment on Speech-to-Text Translation Systems for Manipuri to English on Low Resource Setting (Sanayai Meetei et al., ICON 2021)
PDF:
https://preview.aclanthology.org/nschneid-patch-1/2021.icon-main.8.pdf