Agnes Lisowska
During the construction of a spoken dialogue system, much effort is spent on improving the quality of speech recognition as much as possible. However, even if an application perfectly recognizes the input, its understanding may be far from what the user originally meant. The user should be informed about what the system actually understood so that an error will not have a negative impact in the later stages of the dialogue. One important aspect that this work tries to address is the effect of presenting the system's understanding during interaction with users. We argue that for specific kinds of applications it is important to confirm the understanding of the system before obtaining the output. In this way the user can avoid misconceptions and problems occurring in the dialogue flow and can gain confidence in the system. Nevertheless, this has an impact on the interaction, as the mental workload increases and the user's behavior may adapt to the system's coverage. We focus on two applications that implement the notion of rephrasing the user's input in different ways. Our study was conducted with 14 subjects who used both systems on a Nokia N810 Internet Tablet.
This paper describes the Relative Ordering Tool for Evaluation (ROTE), which is designed to support the process of building a parameterised quality model for evaluation. It is a very simple tool which enables users to specify the relative importance of quality characteristics (and associated metrics) to reflect the users' particular requirements. The tool allows users to order any number of quality characteristics by comparing them in a pair-wise fashion. The tool was developed in the context of a collaborative project developing a text mining system. A full-scale evaluation of the text mining system was designed and executed for three different users, and ROTE was successfully applied by those users during that process. The tool will be made available for general use by the evaluation community.
In this paper we present a proposal for extending the standard Wizard of Oz experimental methodology to language-enabled multimodal systems. We first discuss how Wizard of Oz experiments involving multimodal systems differ from those involving voice-only systems. We then go on to discuss the Extended Wizard of Oz methodology and the Wizard of Oz testing environment and protocol that we have developed. We then describe an example of applying this methodology to Archivus, a multimodal system for multimedia meeting retrieval and browsing. We focus in particular on the tools that the wizards would need to successfully and efficiently perform their tasks in a multimodal context. We conclude with some general comments about which questions need to be addressed when developing and using the Wizard of Oz methodology for testing multimodal systems.
The Parmenides project developed a text mining application applied in three different domains exemplified by case studies for the three user partners in the project. During the lifetime of the project (and in parallel with the development of the system itself) an evaluation framework was developed by the authors in conjunction with the users, and was eventually applied to the system. The object of the exercise was two-fold: firstly to develop and perform a complete user-centered evaluation of the system to assess how well it answered the users' requirements and, secondly, to develop a general framework which could be applied in the context of other users' requirements and (with some modification) to similar systems. In this paper we describe not only the framework but the process of building and parameterising the quality model for each case study and, perhaps most interestingly, the way in which the quality model and users' requirements and expectations evolved over time.