Hayastan Avetisyan


2025

VerbCraft: Morphologically-Aware Armenian Text Generation Using LLMs in Low-Resource Settings
Hayastan Avetisyan | David Broneske
Proceedings of the Third Workshop on Resources and Representations for Under-Resourced Languages and Domains (RESOURCEFUL-2025)

Understanding and generating morphologically complex verb forms is a critical challenge in Natural Language Processing (NLP), particularly for low-resource languages like Armenian. Armenian’s verb morphology encodes multiple layers of grammatical information, such as tense, aspect, mood, voice, person, and number, requiring nuanced computational modeling. We introduce VerbCraft, a novel neural model that integrates explicit morphological classifiers into the mBART-50 architecture. VerbCraft achieves a BLEU score of 0.4899 on test data, compared to the baseline’s 0.9975, reflecting its focus on prioritizing morphological precision over fluency. With over 99% accuracy in aspect and voice predictions and robust performance on rare and irregular verb forms, VerbCraft addresses data scarcity through synthetic data generation with human-in-the-loop validation. Beyond Armenian, it offers a scalable framework for morphologically rich, low-resource languages, paving the way for linguistically informed NLP systems and advancing language preservation efforts.

2023

Large Language Models and Low-Resource Languages: An Examination of Armenian NLP
Hayastan Avetisyan | David Broneske
Findings of the Association for Computational Linguistics: IJCNLP-AACL 2023 (Findings)

2021

Identifying and Understanding Game-Framing in Online News: BERT and Fine-Grained Linguistic Features
Hayastan Avetisyan | David Broneske
Proceedings of the 4th International Conference on Natural Language and Speech Processing (ICNLSP 2021)