@inproceedings{darwish-etal-2017-arabic,
    title = "{A}rabic Diacritization: Stats, Rules, and Hacks",
    author = "Darwish, Kareem  and
      Mubarak, Hamdy  and
      Abdelali, Ahmed",
    editor = "Habash, Nizar  and
      Diab, Mona  and
      Darwish, Kareem  and
      El-Hajj, Wassim  and
      Al-Khalifa, Hend  and
      Bouamor, Houda  and
      Tomeh, Nadi  and
      El-Haj, Mahmoud  and
      Zaghouani, Wajdi",
    booktitle = "Proceedings of the Third {A}rabic Natural Language Processing Workshop",
    month = apr,
    year = "2017",
    address = "Valencia, Spain",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/W17-1302",
    doi = "10.18653/v1/W17-1302",
    pages = "9--17",
    abstract = "In this paper, we present a new and fast state-of-the-art Arabic diacritizer that guesses the diacritics of words and then their case endings. We employ a Viterbi decoder at word-level with back-off to stem, morphological patterns, and transliteration and sequence labeling based diacritization of named entities. For case endings, we use Support Vector Machine (SVM) based ranking coupled with morphological patterns and linguistic rules to properly guess case endings. We achieve a low word level diacritization error of 3.29{\%} and 12.77{\%} without and with case endings respectively on a new multi-genre free of copyright test set. We are making the diacritizer available for free for research purposes.",
}