Abstract
This memo describes the NTR-TSU submission for the SIGTYP 2021 Shared Task on predicting language IDs from speech. Spoken Language Identification (LID) is an important step in a multilingual Automated Speech Recognition (ASR) system pipeline. For many low-resource and endangered languages, only single-speaker recordings may be available, creating a need for domain- and speaker-invariant language ID systems. In this memo, we show that a convolutional neural network with a Self-Attentive Pooling layer achieves promising results on the language identification task.
- Anthology ID:
- 2021.sigtyp-1.12
- Volume:
- Proceedings of the Third Workshop on Computational Typology and Multilingual NLP
- Month:
- June
- Year:
- 2021
- Address:
- Online
- Venue:
- SIGTYP
- SIG:
- SIGTYP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 130–135
- URL:
- https://aclanthology.org/2021.sigtyp-1.12
- DOI:
- 10.18653/v1/2021.sigtyp-1.12
- Cite (ACL):
- Roman Bedyakin and Nikolay Mikhaylovskiy. 2021. Language ID Prediction from Speech Using Self-Attentive Pooling. In Proceedings of the Third Workshop on Computational Typology and Multilingual NLP, pages 130–135, Online. Association for Computational Linguistics.
- Cite (Informal):
- Language ID Prediction from Speech Using Self-Attentive Pooling (Bedyakin & Mikhaylovskiy, SIGTYP 2021)
- PDF:
- https://preview.aclanthology.org/auto-file-uploads/2021.sigtyp-1.12.pdf