Michael Kaisser

Also published as: Michael Kaißer


2012

2008

Each year NIST releases a set of (question, document ID, answer) triples for the factoid questions used in the TREC Question Answering track. While this resource is widely used and has proved useful for many purposes, its granularity is too coarse for many others. In this paper we describe how we used Amazon’s Mechanical Turk to have multiple subjects read the documents and identify the sentences that contain the answer. For most of the 1911 questions in the test sets from 2002 to 2006, and for each of the documents said to contain an answer, the Question-Answer Sentence Pairs (QASP) corpus introduced in this paper contains the identified answer sentences. We believe that this corpus, which we will make available to the public, can further stimulate research in QA, especially linguistically motivated research, where matching the question to the answer sentence by either syntactic or semantic means is a central concern.

2007

2006

We describe a corpus of multimodal dialogues with an MP3 player collected in Wizard-of-Oz experiments and annotated with a rich feature set at several layers. We are using the Nite XML Toolkit (NXT) to represent and further process the data. We designed an NXT data model, converted experiment log file data and manual transcriptions into NXT, and are building tools for additional annotation using NXT libraries. The annotated corpus will be used to (i) investigate various aspects of multimodal presentation and interaction strategies both within and across annotation layers; (ii) design an initial policy for reinforcement learning of multimodal clarification requests.

2005