2024
pdf
abs
KSAA-CAD Shared Task: Contemporary Arabic Dictionary for Reverse Dictionary and Word Sense Disambiguation
Waad Alshammari
|
Amal Almazrua
|
Asma Al Wazrah
|
Rawan Almatham
|
Muneera Alhoshan
|
Abdulrahman Alosaimy
Proceedings of The Second Arabic Natural Language Processing Conference
This paper outlines the KSAA-CAD shared task, highlighting the Contemporary Arabic Language Dictionary within the scenario of developing a Reverse Dictionary (RD) system and enhancing Word Sense Disambiguation (WSD) capabilities. The first KSAA-RD (Al-Matham et al., 2023) highlighted significant gaps in the domain of RDs, which are designed to retrieve words by their meanings or definitions. This shared task comprises two tasks: RD and WSD. The RD task focuses on identifying word embeddings that most accurately match a given definition, termed a “gloss,” in Arabic. Conversely, the WSD task involves determining the specific meaning of a word in context, particularly when the word has multiple meanings. The winning team achieved the highest-ranking score of 0.0644 in RD using Electra embeddings. In this paper, we describe the methods employed by the participating teams and provide insights into the future direction of KSAA-CAD.
pdf
abs
muNERa at WojoodNER 2024: Multi-tasking NER Approach
Nouf Alotaibi
|
Haneen Alhomoud
|
Hanan Murayshid
|
Waad Alshammari
|
Nouf Alshalawi
|
Sakhar Alkhereyf
Proceedings of The Second Arabic Natural Language Processing Conference
This paper presents our system “muNERa”, submitted to the WojoodNER 2024 shared task at the second ArabicNLP conference. We participated in two subtasks, the flat and nested fine-grained NER sub-tasks (1 and 2). muNERa achieved first place in the nested NER sub-task and second place in the flat NER sub-task. The system is based on the TANL framework (CITATION),by using a sequence-to-sequence structured language translation approach to model both tasks. We utilize the pre-trained AraT5v2-base model as the base model for the TANL framework. The best-performing muNERa model achieves 91.07% and 90.26% for the F-1 scores on the test sets for the nested and flat subtasks, respectively.
pdf
abs
AraMed: Arabic Medical Question Answering using Pretrained Transformer Language Models
Ashwag Alasmari
|
Sarah Alhumoud
|
Waad Alshammari
Proceedings of the 6th Workshop on Open-Source Arabic Corpora and Processing Tools (OSACT) with Shared Tasks on Arabic LLMs Hallucination and Dialect to MSA Machine Translation @ LREC-COLING 2024
Medical Question Answering systems have gained significant attention in recent years due to their potential to enhance medical decision-making and improve patient care. However, most of the research in this field has focused on English-language datasets, limiting the generalizability of MQA systems to non-English speaking regions. This study introduces AraMed, a large-scale Arabic Medical Question Answering dataset addressing the limited resources available for Arabic medical question answering. AraMed comprises of 270k question-answer pairs based on health consumer questions submitted to online medical forum. Experiments using various deep learning models showcase the dataset’s effectiveness, particularly with AraBERT models achieving highest results, specifically AraBERTv2 obtained an F1 score of 96.73% in the answer selection task. The comparative analysis of different deep learning models provides insights into their strengths and limitations. These findings highlight the potential of AraMed for advancing Arabic medical question answering research and development.
2023
pdf
abs
KSAA-RD Shared Task: Arabic Reverse Dictionary
Rawan Al-Matham
|
Waad Alshammari
|
Abdulrahman AlOsaimy
|
Sarah Alhumoud
|
Asma Wazrah
|
Afrah Altamimi
|
Halah Alharbi
|
Abdullah Alaifi
Proceedings of ArabicNLP 2023
This paper outlines the KSAA-RD shared task, which aims to develop a Reverse Dictionary (RD) system for the Arabic language. RDs allow users to find words based on their meanings or definition. This shared task, KSAA-RD, includes two subtasks: Arabic RD and cross-lingual reverse dictionaries (CLRD). Given a definition (referred to as a “gloss”) in either Arabic or English, the teams compete to find the most similar word embeddings of their corresponding word. The winning team achieved 24.20 and 12.70 for RD and CLRD, respectively in terms of rank metric. In this paper, we describe the methods employed by the participating teams and offer an outlook for KSAA-RD.