Abrar Abir
2026
LAILA: A Large Trait-Based Dataset for Arabic Automated Essay Scoring
May Bashendy | Walid Massoud | Sohaila Eltanbouly | Salam Albatarni | Marwan Sayed | Abrar Abir | Houda Bouamor | Tamer Elsayed
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Automated Essay Scoring (AES) has gained increasing attention in recent years, yet research on Arabic AES remains limited due to the lack of publicly available datasets. To address this, we introduce LAILA, the largest publicly available Arabic AES dataset to date, comprising 7,859 essays annotated with holistic and trait-specific scores on seven dimensions: relevance, organization, vocabulary, style, development, mechanics, and grammar. We detail the dataset design, collection, and annotations, and provide benchmark results using state-of-the-art Arabic and English models in prompt-specific and cross-prompt settings. LAILA fills a critical need in Arabic AES research, supporting the development of robust scoring systems.
2024
Nullpointer at ArAIEval Shared Task: Arabic Propagandist Technique Detection with Token-to-Word Mapping in Sequence Tagging
Abrar Abir | Kemal Oflazer
Proceedings of the Second Arabic Natural Language Processing Conference
This paper investigates the optimization of propaganda technique detection in Arabic text, including tweets and news paragraphs, from ArAIEval shared task 1. Our approach involves fine-tuning the AraBERT v2 model with a neural network classifier for sequence tagging. Experimental results show that relying on the first token of each word for technique prediction yields the best performance. In addition, incorporating genre information as a feature further enhances the model’s performance. Our system achieved a score of 25.41, placing us 4th on the leaderboard. Subsequent post-submission improvements further raised our score to 26.68.
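The first-token strategy described above can be sketched as a label-alignment step: when a subword tokenizer splits a word into several pieces, the word's label is assigned to the first subtoken and the remaining subtokens are masked out of the loss and the prediction. This is a minimal illustration under assumed inputs; the `word_ids` list here is a hypothetical stand-in for the mapping a HuggingFace tokenizer would return, not the paper's actual code.

```python
# Sketch of first-subtoken label alignment for sequence tagging.
# IGNORE is the conventional masking value for PyTorch's CrossEntropyLoss.
IGNORE = -100

def align_labels(word_labels, word_ids):
    """Assign each word's label to its first subtoken; mask the rest.

    word_labels: one label id per word.
    word_ids: per-subtoken word index, with None for special tokens
              (mimicking the word_ids() mapping of a subword tokenizer).
    """
    aligned, prev = [], None
    for wid in word_ids:
        if wid is None:            # special tokens such as [CLS] / [SEP]
            aligned.append(IGNORE)
        elif wid != prev:          # first subtoken of a new word: keep label
            aligned.append(word_labels[wid])
        else:                      # continuation subtoken: mask
            aligned.append(IGNORE)
        prev = wid
    return aligned

# Two words; the second splits into two subtokens.
print(align_labels([3, 7], [None, 0, 1, 1, None]))
# → [-100, 3, 7, -100, -100]
```

At inference time the same mapping is applied in reverse: only the prediction at each word's first subtoken is read off as that word's technique label.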