Yuki Irie


2006

Layered Speech-Act Annotation for Spoken Dialogue Corpus
Yuki Irie | Shigeki Matsubara | Nobuo Kawaguchi | Yukiko Yamaguchi | Yasuyoshi Inagaki
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

This paper describes the design of speech act tags for spoken dialogue corpora and their evaluation. Compared with the tags used for conventional corpus annotation, the proposed speech intention tag is specialized enough to determine system operations. However, describing information in this detail increases the number of tag types, which makes tag selection ambiguous. We therefore designed an organization of tags that focuses on layered tagging and context-dependent tagging. Over 35,000 utterance units in the CIAIR corpus have been tagged by hand. To evaluate the reliability of the intention tags, we conducted a tagging experiment in which the tags assigned by several annotators were compared using the kappa value. The results confirmed that reliable data can be built. A corpus with speech intention tags could be used widely, from basic research to applications of spoken dialogue, and would play an important role in the practical use of spoken dialogue corpora.
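
The abstract does not say which kappa variant was used or how many annotators were compared. As a minimal sketch of the general idea, the following computes Cohen's kappa for two annotators, which corrects their observed agreement for the agreement expected by chance. The function name `cohen_kappa` and the tag labels in the example are hypothetical and are not taken from the paper's tag set.

```python
from collections import Counter


def cohen_kappa(tags_a, tags_b):
    """Cohen's kappa for two annotators labelling the same utterance units.

    Assumption: two annotators, one tag per utterance unit. The paper may
    have used a different kappa variant.
    """
    if len(tags_a) != len(tags_b) or not tags_a:
        raise ValueError("both annotators must tag the same non-empty set of units")

    n = len(tags_a)

    # Observed agreement: fraction of units given the same tag by both.
    observed = sum(a == b for a, b in zip(tags_a, tags_b)) / n

    # Chance agreement: probability that both pick the same tag if each
    # annotator chose tags independently with their own tag frequencies.
    count_a = Counter(tags_a)
    count_b = Counter(tags_b)
    expected = sum(count_a[t] * count_b[t] for t in count_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical tag throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical tags for six utterance units (not the paper's tag set).
annotator_a = ["Request", "Request", "Inform", "Inform", "Confirm", "Request"]
annotator_b = ["Request", "Inform", "Inform", "Inform", "Confirm", "Request"]

kappa = cohen_kappa(annotator_a, annotator_b)
print(f"kappa = {kappa:.3f}")
```

In this example the annotators agree on 5 of 6 units, the chance agreement is 13/36, and kappa is therefore 17/23, or about 0.739. With more than two annotators, a multi-rater statistic such as Fleiss' kappa would be the usual alternative.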