CamelParser2.0: A State-of-the-Art Dependency Parser for Arabic

Ahmed Elshabrawy, Muhammed AbuOdeh, Go Inoue, Nizar Habash


Abstract
We present CamelParser2.0, an open-source, Python-based Arabic dependency parser targeting two popular Arabic dependency formalisms: the Columbia Arabic Treebank (CATiB) and Universal Dependencies (UD). The CamelParser2.0 pipeline processes raw text and produces tokenization, part-of-speech tags, and rich morphological features. As part of developing CamelParser2.0, we explore many system design hyper-parameters, such as parsing model architecture and pretrained language model selection, achieving new state-of-the-art performance across diverse Arabic genres under gold and predicted tokenization settings.
Anthology ID:
2023.arabicnlp-1.15
Volume:
Proceedings of ArabicNLP 2023
Month:
December
Year:
2023
Address:
Singapore (Hybrid)
Editors:
Hassan Sawaf, Samhaa El-Beltagy, Wajdi Zaghouani, Walid Magdy, Ahmed Abdelali, Nadi Tomeh, Ibrahim Abu Farha, Nizar Habash, Salam Khalifa, Amr Keleg, Hatem Haddad, Imed Zitouni, Khalil Mrini, Rawan Almatham
Venues:
ArabicNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
170–180
URL:
https://aclanthology.org/2023.arabicnlp-1.15
DOI:
10.18653/v1/2023.arabicnlp-1.15
Cite (ACL):
Ahmed Elshabrawy, Muhammed AbuOdeh, Go Inoue, and Nizar Habash. 2023. CamelParser2.0: A State-of-the-Art Dependency Parser for Arabic. In Proceedings of ArabicNLP 2023, pages 170–180, Singapore (Hybrid). Association for Computational Linguistics.
Cite (Informal):
CamelParser2.0: A State-of-the-Art Dependency Parser for Arabic (Elshabrawy et al., ArabicNLP-WS 2023)
PDF:
https://preview.aclanthology.org/emnlp22-frontmatter/2023.arabicnlp-1.15.pdf