William J. Teahan

Also published as: W. J. Teahan, William J Teahan


2017

A New Error Annotation for Dyslexic texts in Arabic
Maha Alamri | William J Teahan
Proceedings of the Third Arabic Natural Language Processing Workshop

This paper aims to develop a new classification of errors made in Arabic by those suffering from dyslexia to be used in the annotation of the Arabic dyslexia corpus (BDAC). The dyslexic error classification for Arabic texts (DECA) comprises a list of spelling errors extracted from previous studies and a collection of texts written by people with dyslexia that can provide a framework to help analyse specific errors committed by dyslexic writers. The classification comprises 37 types of errors, grouped into nine categories. The paper also discusses building a corpus of dyslexic Arabic texts that uses the error annotation scheme and provides an analysis of the errors that were found in the texts.

2000

A compression-based algorithm for Chinese word segmentation
W. J. Teahan | Yingying Wen | Rodger McNab | Ian H. Witten
Computational Linguistics, Volume 26, Number 3, September 2000