Enabling Interactive Transcription in an Indigenous Community

Eric Le Ferrand, Steven Bird, Laurent Besacier


Abstract
We propose a novel transcription workflow that combines spoken term detection with a human in the loop, and we report a pilot experiment. This work is grounded in an almost zero-resource scenario involving two endangered languages, where only a few terms have so far been identified. We show that in the early stages of transcription, when the available data is insufficient to train a robust ASR system, it is possible to take advantage of the transcription of a small number of isolated words in order to bootstrap the transcription of a speech collection.
Anthology ID:
2020.coling-main.303
Volume:
Proceedings of the 28th International Conference on Computational Linguistics
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
3422–3428
URL:
https://aclanthology.org/2020.coling-main.303
DOI:
10.18653/v1/2020.coling-main.303
Cite (ACL):
Eric Le Ferrand, Steven Bird, and Laurent Besacier. 2020. Enabling Interactive Transcription in an Indigenous Community. In Proceedings of the 28th International Conference on Computational Linguistics, pages 3422–3428, Barcelona, Spain (Online). International Committee on Computational Linguistics.
Cite (Informal):
Enabling Interactive Transcription in an Indigenous Community (Le Ferrand et al., COLING 2020)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2020.coling-main.303.pdf