Sandra Petel


2006

The Ritel Corpus - An annotated Human-Machine open-domain question answering spoken dialog corpus
Sophie Rosset | Sandra Petel
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

In this paper we present a real (as opposed to Wizard-of-Oz) Human-Computer QA-oriented spoken dialog corpus collected with our Ritel platform. This corpus has been orthographically transcribed and annotated in terms of Specific Entities and Topics. Twelve main topics have been chosen and refined into 22 sub-topics. The Specific Entities fall into five categories: Named Entities, linguistic entities, topic-defining entities, general entities and extended entities. The corpus contains 582 dialogs, totaling 6 hours of user speech.