Yossra Noureldien
2025
Zero-Shot and Fine-Tuned Evaluation of Generative LLMs for Arabic Word Sense Disambiguation
Yossra Noureldien | Abdelrazig Mohamed | Farah Attallah
Proceedings of The Third Arabic Natural Language Processing Conference
Arabic presents unique challenges for sense-level language understanding due to its rich morphology and semantic ambiguity. This paper benchmarks large generative language models (LLMs) for Arabic Word Sense Disambiguation (WSD) under both zero-shot and fine-tuning conditions. We evaluate one proprietary model (GPT-4o) and three open-source models (LLaMA 3.1-8B, Qwen 2.5-7B, and Gemma 2-9B) on two publicly available datasets. In zero-shot settings, GPT-4o achieved the highest overall performance, with comparable results across both datasets, reaching 79% accuracy and an average macro-F1 score of 66.08%. Fine-tuning, however, lifted all three open models beyond GPT-4o's zero-shot results. Qwen achieved the top scores on one dataset, with an accuracy of 90.77% and a macro-F1 score of 83.98%, while LLaMA scored highest on the other, reaching an accuracy of 88.51% and a macro-F1 score of 69.41%. These findings demonstrate that parameter-efficient supervised adaptation can close much of the performance gap and establish strong, reproducible baselines for Arabic WSD using open-source, relatively medium-sized models. Full code is publicly available.
Athar at QIAS2025: LLM-based Question Answering Systems for Islamic Inheritance and Classical Islamic Knowledge
Yossra Noureldien | Hassan Suliman | Farah Attallah | Abdelrazig Mohamed | Sara Abdalla
Proceedings of The Third Arabic Natural Language Processing Conference: Shared Tasks