Dialects Identification of Armenian Language

Karen Avetisyan


Abstract
The Armenian language has many dialects that differ from each other syntactically, morphologically, and phonetically. In this work, we implement and evaluate models that determine the dialect of a given passage of text. The proposed models are evaluated on the three major varieties of Armenian: Eastern, Western, and Classical. Previously, no tools for Armenian dialect identification existed. The paper presents three approaches: a statistical one that relies on a stop-word dictionary, a modified statistical one that uses a dictionary of the most frequently encountered words, and a third based on Facebook’s fastText language identification neural network model. Two types of neural network models were trained, one using pre-trained word embeddings and the other without them. The approaches were tested on sentence-level and document-level data. The results show that the neural network-based method performs substantially better than the statistical ones, achieving almost 98% accuracy at the sentence level and nearly 100% at the document level.
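As a rough illustration of the dictionary-based statistical idea mentioned in the abstract, the following Python sketch scores a passage by counting how many of its tokens appear in each dialect's word list and picks the highest-scoring dialect. It is not the paper's implementation: the word lists are placeholder tokens rather than real Armenian stop words, and the function name `identify_dialect`, the whitespace tokenization, and the tie-breaking rule are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical toy dictionaries. The entries are placeholder tokens,
# not real Armenian stop words and not the dictionaries used in the paper.
STOP_WORDS = {
    "eastern": {"e_tok1", "e_tok2", "shared"},
    "western": {"w_tok1", "w_tok2", "shared"},
    "classical": {"c_tok1", "c_tok2"},
}


def identify_dialect(text, dictionaries=STOP_WORDS):
    """Return the dialect whose dictionary matches the most tokens.

    Returns None when no token matches any dictionary. On a tie, the
    dialect listed first in `dictionaries` wins (an arbitrary choice).
    """
    # Assumption: simple lowercasing and whitespace tokenization.
    tokens = text.lower().split()
    scores = Counter()
    for dialect, words in dictionaries.items():
        scores[dialect] = sum(1 for token in tokens if token in words)
    best, best_score = max(scores.items(), key=lambda kv: kv[1])
    return best if best_score > 0 else None
```

Calling `identify_dialect` on a sentence or on a whole document corresponds to the sentence-level and document-level settings; longer inputs give the counts more evidence to work with. Replacing the stop-word lists with lists of each dialect's most frequent words would give a sketch of the modified statistical variant.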
Anthology ID:
2022.digitam-1.2
Volume:
Proceedings of the Workshop on Processing Language Variation: Digital Armenian (DigitAm) within the 13th Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Venue:
DigitAm
Publisher:
European Language Resources Association
Pages:
8–12
URL:
https://aclanthology.org/2022.digitam-1.2
Cite (ACL):
Karen Avetisyan. 2022. Dialects Identification of Armenian Language. In Proceedings of the Workshop on Processing Language Variation: Digital Armenian (DigitAm) within the 13th Language Resources and Evaluation Conference, pages 8–12, Marseille, France. European Language Resources Association.
Cite (Informal):
Dialects Identification of Armenian Language (Avetisyan, DigitAm 2022)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2022.digitam-1.2.pdf