Automatic Speech Recognition and Query By Example for Creole Languages Documentation
Cécile Macaire, Didier Schwab, Benjamin Lecouteux, Emmanuel Schang
Abstract
We investigate the exploitation of self-supervised models for two Creole languages with few resources: Gwadloupéyen and Morisien. Automatic language processing tools are almost non-existent for these two languages. We propose to use about one hour of annotated data to design an automatic speech recognition system for each language. We evaluate how much data is needed to obtain a query-by-example system that is usable by linguists. Moreover, our experiments show that multilingual self-supervised models are not necessarily the most efficient for Creole languages.
- Anthology ID:
- 2022.findings-acl.197
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2022
- Month:
- May
- Year:
- 2022
- Address:
- Dublin, Ireland
- Editors:
- Smaranda Muresan, Preslav Nakov, Aline Villavicencio
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 2512–2520
- URL:
- https://aclanthology.org/2022.findings-acl.197
- DOI:
- 10.18653/v1/2022.findings-acl.197
- Cite (ACL):
- Cécile Macaire, Didier Schwab, Benjamin Lecouteux, and Emmanuel Schang. 2022. Automatic Speech Recognition and Query By Example for Creole Languages Documentation. In Findings of the Association for Computational Linguistics: ACL 2022, pages 2512–2520, Dublin, Ireland. Association for Computational Linguistics.
- Cite (Informal):
- Automatic Speech Recognition and Query By Example for Creole Languages Documentation (Macaire et al., Findings 2022)
- PDF:
- https://preview.aclanthology.org/dois-2013-emnlp/2022.findings-acl.197.pdf
- Code:
- macairececile/asr-qbe-creole