Abstract
Armenian is a traditionally under-resourced language, which has seen a recent uptick in interest in the development of its tools and presence in the digital domain. Some of this recent interest has centred around the development of Automatic Speech Recognition (ASR) technologies. However, the language boasts two standard variants which diverge on multiple typological and structural levels. In this work, we examine some of the available bodies of data for ASR construction, present the challenges in the processing of these data and propose a methodology going forward.
- Anthology ID:
- 2022.digitam-1.6
- Volume:
- Proceedings of the Workshop on Processing Language Variation: Digital Armenian (DigitAm) within the 13th Language Resources and Evaluation Conference
- Month:
- June
- Year:
- 2022
- Address:
- Marseille, France
- Editors:
- Victoria Khurshudyan, Nadi Tomeh, Damien Nouvel, Anaid Donabedian, Chahan Vidal-Gorene
- Venue:
- DigitAm
- Publisher:
- European Language Resources Association
- Pages:
- 38–42
- URL:
- https://aclanthology.org/2022.digitam-1.6
- Cite (ACL):
- Samuel Chakmakjian and Ilaine Wang. 2022. Towards a Unified ASR System for the Armenian Standards. In Proceedings of the Workshop on Processing Language Variation: Digital Armenian (DigitAm) within the 13th Language Resources and Evaluation Conference, pages 38–42, Marseille, France. European Language Resources Association.
- Cite (Informal):
- Towards a Unified ASR System for the Armenian Standards (Chakmakjian & Wang, DigitAm 2022)
- PDF:
- https://preview.aclanthology.org/improve-issue-templates/2022.digitam-1.6.pdf