Abstract
We present, to our knowledge, the first ever published morphological analyser and generator for Sakha, a marginalised language of Siberia. The transducer, developed using HFST, has coverage solidly above 90% and high precision. In developing the analyser, we have expanded linguistic knowledge about Sakha and developed strategies for complex grammatical patterns. The transducer is already being used in downstream tasks, including computer-assisted language learning applications for linguistic maintenance and computational linguistic shared tasks.
- Anthology ID:
- 2022.lrec-1.550
- Volume:
- Proceedings of the Thirteenth Language Resources and Evaluation Conference
- Month:
- June
- Year:
- 2022
- Address:
- Marseille, France
- Editors:
- Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association
- Pages:
- 5137–5142
- URL:
- https://aclanthology.org/2022.lrec-1.550
- Cite (ACL):
- Sardana Ivanova, Jonathan Washington, and Francis Tyers. 2022. A Free/Open-Source Morphological Analyser and Generator for Sakha. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 5137–5142, Marseille, France. European Language Resources Association.
- Cite (Informal):
- A Free/Open-Source Morphological Analyser and Generator for Sakha (Ivanova et al., LREC 2022)
- PDF:
- https://preview.aclanthology.org/naacl24-info/2022.lrec-1.550.pdf
- Code:
- apertium/apertium-sah