Open-domain Arabic Conversational Question Answering with Question Rewriting

Mariam E. Hassib, Nagwa El-Makky, Marwan Torki


Abstract
Conversational question answering (CQA) plays a crucial role in bridging the gap between human language and machine understanding, enabling more natural and interactive exchanges with AI systems. In this work, we present the first results on open-domain Arabic CQA using deep learning. We introduce AraQReCC, a large-scale Arabic CQA dataset containing 9K conversations with 62K question-answer pairs, created by translating a subset of the QReCC dataset. To ensure data quality, we combined COMET-based filtering with ratings from large language models (LLMs), such as GPT-4 and LLaMA, keeping only conversations that passed the COMET filter and received LLM ratings of 4 or more. AraQReCC facilitates advanced research in Arabic CQA, with question rewriting used to improve the clarity and relevance of questions. We applied AraT5 for question rewriting and used BM25 and Dense Passage Retrieval (DPR) for passage retrieval. AraT5 is also used for question answering, completing the end-to-end system. Our experiments show that the best performance is achieved with DPR, attaining an F1 score of 21.51% on the test set. While this falls short of the human upper bound of 40.22%, it underscores the importance of question rewriting and quality-controlled data in enhancing system performance.
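The abstract reports answer quality as an F1 score but does not say how it is computed. As a rough illustration only, the sketch below assumes the token-overlap F1 commonly used in question-answering evaluation, with plain whitespace tokenization. The function name `token_f1` and the tokenization choice are assumptions made for this example, not details taken from the paper, which may apply Arabic-specific normalization before scoring.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted and a reference answer.

    Assumption: whitespace tokenization with no text normalization.
    """
    pred_tokens = prediction.split()
    ref_tokens = reference.split()

    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)

    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under this definition, a system-level score such as the reported 21.51% would correspond to the mean of `token_f1` over all test questions, multiplied by 100.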
Anthology ID:
2025.arabicnlp-main.7
Volume:
Proceedings of The Third Arabic Natural Language Processing Conference
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Kareem Darwish, Ahmed Ali, Ibrahim Abu Farha, Samia Touileb, Imed Zitouni, Ahmed Abdelali, Sharefah Al-Ghamdi, Sakhar Alkhereyf, Wajdi Zaghouani, Salam Khalifa, Badr AlKhamissi, Rawan Almatham, Injy Hamed, Zaid Alyafeai, Areeb Alowisheq, Go Inoue, Khalil Mrini, Waad Alshammari
Venue:
ArabicNLP
Publisher:
Association for Computational Linguistics
Pages:
84–96
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.arabicnlp-main.7/
Cite (ACL):
Mariam E. Hassib, Nagwa El-Makky, and Marwan Torki. 2025. Open-domain Arabic Conversational Question Answering with Question Rewriting. In Proceedings of The Third Arabic Natural Language Processing Conference, pages 84–96, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Open-domain Arabic Conversational Question Answering with Question Rewriting (Hassib et al., ArabicNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.arabicnlp-main.7.pdf