Kareem Mohamed Darwish
2026
From RAG to Agentic RAG for Faithful Islamic Question Answering
Gagan Bhatia | Hamdy Mubarak | Mustafa Jarrar | George Mikros | Fadi Zaraket | Mahmoud Alhirthani | Mutaz al-Khatib | Logan Cochrane | Kareem Mohamed Darwish | Rashid Yahiaoui | Firoj Alam
Findings of the Association for Computational Linguistics: ACL 2026
Large Language Models (LLMs) are increasingly used for Islamic question answering, where ungrounded responses may carry serious religious consequences. Yet standard MCQ/MRC-style evaluations do not capture key real-world failure modes, notably free-form hallucinations and the ability to abstain when evidence is insufficient. To address this gap, we introduce IslamicFaithQA, a 3,810-item bilingual (Arabic/English) **generative** benchmark with atomic single-gold answers, which enables direct measurement of hallucination and abstention. We additionally develop an end-to-end grounded Islamic modeling suite consisting of *(i)* 25K Arabic text-grounded SFT reasoning pairs, *(ii)* 5K bilingual preference samples for reward-guided alignment, and *(iii)* a verse-level Qur’an retrieval corpus of ∼6k atomic *verses* (ayat). Building on these resources, we develop an agentic Qur’an-grounding framework (agentic RAG) that uses structured tool calls for iterative evidence seeking and answer revision. Experiments across Arabic-centric and multilingual LLMs show that retrieval improves correctness and that agentic RAG yields the largest gains beyond standard RAG, achieving state-of-the-art performance and stronger Arabic–English robustness even with a small model (i.e., Qwen3 4B). The datasets are publicly available (https://huggingface.co/datasets/QCRI/IslamicFaithQA).
Fanar-Sadiq: A Multi-Agent Architecture for Grounded Islamic QA
Ummar Abbas | Mourad Ouzzani | Mohamed Y. Eltabakh | Omar Sinan | Gagan Bhatia | Hamdy Mubarak | Majd Hawasly | Mohammed Qusay Hashim | Kareem Mohamed Darwish | Firoj Alam
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track)
Large language models (LLMs) can answer religious knowledge queries fluently, yet they often hallucinate and misattribute sources, which is especially consequential in Islamic settings where users expect grounding in canonical texts (Qur’an and Hadith) and jurisprudential (fiqh) nuance. Retrieval-augmented generation (RAG) improves grounding; however, a single retrieve-then-generate pipeline is insufficient for diverse Islamic queries, including verbatim scripture, citation-grounded guidance, and rule-constrained computations such as zakat and inheritance. To address these challenges, we present Fanar-Sadiq, a bilingual Arabic-English Islamic QA system built on a multi-agent, tool-augmented architecture. It is a core component of the Fanar AI platform. Fanar-Sadiq routes Islamic queries to specialized modules within an agentic tool architecture. It supports intent-aware routing, retrieval-grounded fiqh answers with normalized citations and verification traces, exact verse lookup with quotation validation, and deterministic Sunni zakat and inheritance calculators with madhhab-sensitive branching. We evaluate the end-to-end system on public Islamic QA benchmarks and show strong effectiveness and efficiency. It is publicly accessible through an API and a Web application and has received over 1.9M accesses in less than a year (https://api.fanar.qa/docs).
2025
IslamicEval 2025: The First Shared Task of Capturing LLMs Hallucination in Islamic Content
Hamdy Mubarak | Rana Malhas | Watheq Mansour | Abubakr Mohamed | Mahmoud Fawzi | Majd Hawasly | Tamer Elsayed | Kareem Mohamed Darwish | Walid Magdy
Proceedings of The Third Arabic Natural Language Processing Conference: Shared Tasks
Tool Calling for Arabic LLMs: Data Strategies and Instruction Tuning
Asım Ersoy | Enes Altinisik | Kareem Mohamed Darwish | Husrev Taha Sencar
Proceedings of The Third Arabic Natural Language Processing Conference
Tool calling is a critical capability that allows Large Language Models (LLMs) to interact with external systems, significantly expanding their utility. However, research and resources for tool calling are predominantly English-centric, leaving a gap in our understanding of how to enable this functionality for other languages, such as Arabic. This paper investigates three key research questions: (1) the necessity of in-language (Arabic) tool-calling data versus relying on cross-lingual transfer, (2) the effect of general-purpose instruction tuning on tool-calling performance, and (3) the value of fine-tuning on specific, high-priority tools. To address these questions, we conduct extensive experiments using base and post-trained variants of an open-weight Arabic LLM. To enable this study, we bridge the resource gap by translating and adapting two open-source tool-calling datasets into Arabic. Our findings provide crucial insights into the optimal strategies for developing robust tool-augmented agents for Arabic.
Co-authors
- Hamdy Mubarak 3
- Firoj Alam 2
- Gagan Bhatia 2
- Majd Hawasly 2
- Ummar Abbas 1
- Mahmoud Alhirthani 1
- Enes Altinisik 1
- Logan Cochrane 1
- Tamer Elsayed 1
- Mohamed Y. Eltabakh 1
- Asım Ersoy 1
- Mahmoud Fawzi 1
- Mohammed Qusay Hashim 1
- Mustafa Jarrar 1
- Walid Magdy 1
- Rana Malhas 1
- Watheq Mansour 1
- George Mikros 1
- Abubakr Mohamed 1
- Mourad Ouzzani 1
- Husrev Taha Sencar 1
- Omar Sinan 1
- Rashid Yahiaoui 1
- Fadi A. Zaraket 1
- Mutaz al-Khatib 1