@inproceedings{habash-etal-2016-exploiting,
    title = "Exploiting {A}rabic Diacritization for High Quality Automatic Annotation",
    author = "Habash, Nizar  and
      Shahrour, Anas  and
      Al-Khalil, Muhamed",
    editor = "Calzolari, Nicoletta  and
      Choukri, Khalid  and
      Declerck, Thierry  and
      Goggi, Sara  and
      Grobelnik, Marko  and
      Maegaard, Bente  and
      Mariani, Joseph  and
      Mazo, Helene  and
      Moreno, Asuncion  and
      Odijk, Jan  and
      Piperidis, Stelios",
    booktitle = "Proceedings of the Tenth International Conference on Language Resources and Evaluation ({LREC}'16)",
    month = may,
    year = "2016",
    address = "Portoro{\v{z}}, Slovenia",
    publisher = "European Language Resources Association (ELRA)",
    url = "https://preview.aclanthology.org/iwcs-25-ingestion/L16-1681/",
    pages = "4298--4304",
    abstract = "We present a novel technique for Arabic morphological annotation. The technique utilizes diacritization to produce morphological annotations of quality comparable to human annotators. Although Arabic text is generally written without diacritics, diacritization is already available for large corpora of Arabic text in several genres. Furthermore, diacritization can be generated at a low cost for new text as it does not require specialized training beyond what educated Arabic typists know. The basic approach is to enrich the input to a state-of-the-art Arabic morphological analyzer with word diacritics (full or partial) to enhance its performance. When applied to fully diacritized text, our approach produces annotations with an accuracy of over 97{\%} on lemma, part-of-speech, and tokenization combined."
}