Abdullah Alsaleh

2022

pdf abs
LK2022 at Qur’an QA 2022: Simple Transformers Model for Finding Answers to Questions from Qur’an
Abdullah Alsaleh | Saud Althabiti | Ibtisam Alshammari | Sarah Alnefaie | Sanaa Alowaidi | Alaa Alsaqer | Eric Atwell | Abdulrahman Altahhan | Mohammad Alsalka
Proceedinsg of the 5th Workshop on Open-Source Arabic Corpora and Processing Tools with Shared Tasks on Qur'an QA and Fine-Grained Hate Speech Detection

Question answering is a specialized area in the field of NLP that aims to extract the answer to a user question from a given text. Most studies in this area focus on the English language, while other languages, such as Arabic, are still in their early stage. Recently, research tend to develop question answering systems for Arabic Islamic texts, which may impose challenges due to Classical Arabic. In this paper, we use Simple Transformers Question Answering model with three Arabic pre-trained language models (AraBERT, CAMeL-BERT, ArabicBERT) for Qur’an Question Answering task using Qur’anic Reading Comprehension Dataset. The model is set to return five answers ranking from the best to worst based on their probability scores according to the task details. Our experiments with development set shows that AraBERT V0.2 model outperformed the other Arabic pre-trained models. Therefore, AraBERT V0.2 was chosen for the the test set and it performed fair results with 0.45 pRR score, 0.16 EM score and 0.42 F1 score.

2021

pdf abs
Quranic Verses Semantic Relatedness Using AraBERT
Abdullah Alsaleh | Eric Atwell | Abdulrahman Altahhan
Proceedings of the Sixth Arabic Natural Language Processing Workshop

Bidirectional Encoder Representations from Transformers (BERT) has gained popularity in recent years producing state-of-the-art performances across Natural Language Processing tasks. In this paper, we used AraBERT language model to classify pairs of verses provided by the QurSim dataset to either be semantically related or not. We have pre-processed The QurSim dataset and formed three datasets for comparisons. Also, we have used both versions of AraBERT, which are AraBERTv02 and AraBERTv2, to recognise which version performs the best with the given datasets. The best results was AraBERTv02 with 92% accuracy score using a dataset comprised of label ‘2’ and label '-1’, the latter was generated outside of QurSim dataset.

Co-authors

Sanaa Alowaidi 1

Alaa Alsaqer 1

Mohammad Alsalka 1

Venues

wanlp1
osact1