Abstract
This paper presents a pilot study for the development of a Corpus of Dutch Aphasic Speech (CoDAS). Given the lack of resources of this kind, not only for Dutch but also for other languages, CoDAS will be able to set standards and will contribute to future research in this area. Because of the special character of the speech contained in CoDAS, we cannot simply carry over the design and annotation protocols of existing corpora such as the Corpus Gesproken Nederlands (CGN) or CHILDES. However, they have been taken as a starting point. We have investigated whether and how the CGN procedures and protocols for annotation (part-of-speech tagging) and transcription (orthographic and phonetic) should be adapted in order to annotate and transcribe aphasic speech properly. In addition, we have established the basic requirements with respect to text types, metadata, and annotation levels that CoDAS should fulfill.
- Anthology ID:
- L06-1417
- Volume:
- Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
- Month:
- May
- Year:
- 2006
- Address:
- Genoa, Italy
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- URL:
- http://www.lrec-conf.org/proceedings/lrec2006/pdf/672_pdf.pdf
- Cite (ACL):
- Eline Westerhout and Paola Monachesi. 2006. A pilot study for a Corpus of Dutch Aphasic Speech (CoDAS). In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06), Genoa, Italy. European Language Resources Association (ELRA).
- Cite (Informal):
- A pilot study for a Corpus of Dutch Aphasic Speech (CoDAS) (Westerhout & Monachesi, LREC 2006)
- PDF:
- http://www.lrec-conf.org/proceedings/lrec2006/pdf/672_pdf.pdf