DARES: Dataset for Arabic Readability Estimation of School Materials
Mo El-Haj, Sultan Almujaiwel, Damith Premasiri, Tharindu Ranasinghe, Ruslan Mitkov
Abstract
This research introduces DARES, a dataset for assessing the readability of Arabic text in Saudi school materials. DARES comprises 13,335 instances from textbooks used in 2021 and contains two subtasks: (a) coarse-grained readability assessment, where the text is classified into educational levels such as primary and secondary, and (b) fine-grained readability assessment, where the text is classified into individual grades. We fine-tuned five transformer models that support Arabic and found that CAMeLBERTmix performed best in all input settings. Evaluation results showed high performance for the coarse-grained readability assessment task, achieving a weighted F1 score of 0.91 and a macro F1 score of 0.79. The fine-grained task achieved a weighted F1 score of 0.68 and a macro F1 score of 0.55. These findings demonstrate the potential of our approach for advancing Arabic text readability assessment in education, with implications for future innovations in the field.
- Anthology ID:
- 2024.determit-1.10
- Volume:
- Proceedings of the Workshop on DeTermIt! Evaluating Text Difficulty in a Multilingual Context @ LREC-COLING 2024
- Month:
- May
- Year:
- 2024
- Address:
- Torino, Italia
- Editors:
- Giorgio Maria Di Nunzio, Federica Vezzani, Liana Ermakova, Hosein Azarbonyad, Jaap Kamps
- Venues:
- DeTermIt | WS
- Publisher:
- ELRA and ICCL
- Pages:
- 103–113
- URL:
- https://aclanthology.org/2024.determit-1.10
- Cite (ACL):
- Mo El-Haj, Sultan Almujaiwel, Damith Premasiri, Tharindu Ranasinghe, and Ruslan Mitkov. 2024. DARES: Dataset for Arabic Readability Estimation of School Materials. In Proceedings of the Workshop on DeTermIt! Evaluating Text Difficulty in a Multilingual Context @ LREC-COLING 2024, pages 103–113, Torino, Italia. ELRA and ICCL.
- Cite (Informal):
- DARES: Dataset for Arabic Readability Estimation of School Materials (El-Haj et al., DeTermIt-WS 2024)
- PDF:
- https://preview.aclanthology.org/ingest-bitext-workshop/2024.determit-1.10.pdf
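
The abstract reports both a weighted and a macro F1 score for each subtask, and the weighted score is the higher one in both cases. A gap of that kind typically appears when classes are imbalanced and the smaller classes are harder to predict. The following is a minimal sketch of how the two averages differ, written in plain Python. The labels and predictions are toy values invented for illustration, not DARES data, and the function names are hypothetical rather than taken from the paper's code.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return a dict mapping each label to its F1 score."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by the number of true instances per class."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[label] * support[label] for label in scores) / len(y_true)


# Toy, imbalanced example (assumption: NOT drawn from DARES).
y_true = ["primary"] * 8 + ["secondary"] * 2
y_pred = ["primary"] * 8 + ["primary", "secondary"]

macro = macro_f1(y_true, y_pred)
weighted = weighted_f1(y_true, y_pred)
print(f"macro F1 = {macro:.3f}, weighted F1 = {weighted:.3f}")
```

In this toy case the minority class has the lower F1, so the macro average falls below the weighted average, which is the same direction as the gap reported in the abstract.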