Towards a First Automatic Unsupervised Morphological Segmentation for Inuinnaqtun

Ngoc Tan Le, Fatiha Sadat


Abstract
Low-resource polysynthetic languages pose many challenges for NLP tasks such as morphological analysis and machine translation, owing to the scarcity of available resources and tools and to the morphological complexity of these languages. This research focuses on morphological segmentation, adapting an unsupervised approach based on Adaptor Grammars to a low-resource setting. Experiments and evaluations on Inuinnaqtun, a language of the Inuit language family in Northern Canada that is considered likely to become extinct in less than two generations, have shown promising results.
Anthology ID:
2021.americasnlp-1.17
Volume:
Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas
Month:
June
Year:
2021
Address:
Online
Editors:
Manuel Mager, Arturo Oncevay, Annette Rios, Ivan Vladimir Meza Ruiz, Alexis Palmer, Graham Neubig, Katharina Kann
Venue:
AmericasNLP
Publisher:
Association for Computational Linguistics
Pages:
159–162
URL:
https://aclanthology.org/2021.americasnlp-1.17
DOI:
10.18653/v1/2021.americasnlp-1.17
Cite (ACL):
Ngoc Tan Le and Fatiha Sadat. 2021. Towards a First Automatic Unsupervised Morphological Segmentation for Inuinnaqtun. In Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas, pages 159–162, Online. Association for Computational Linguistics.
Cite (Informal):
Towards a First Automatic Unsupervised Morphological Segmentation for Inuinnaqtun (Le & Sadat, AmericasNLP 2021)
PDF:
https://preview.aclanthology.org/improve-issue-templates/2021.americasnlp-1.17.pdf