Maria Schmidt

2020

How Users React to Proactive Voice Assistant Behavior While Driving
Maria Schmidt | Wolfgang Minker | Steffen Werner
Proceedings of the Twelfth Language Resources and Evaluation Conference

Personal Assistants (PAs) are now available in many environments and are increasingly popular to use via voice. We therefore aim to provide proactive PA suggestions to car drivers via speech. These suggestions should neither be obtrusive nor increase the drivers’ cognitive load, while enhancing the user experience. To assess these factors, we conducted a Wizard-of-Oz usability study in a driving simulator in which 42 participants perceived proactive voice output. Traffic density was varied during a highway drive that included six in-car-specific use cases. These use cases were presented both by a proactive voice assistant and in a non-proactive control condition. We assessed the users’ subjective cognitive load and their satisfaction using different questionnaires during the interaction with both PA variants. Furthermore, we analyzed the user reactions, both regarding their content and the elapsed response times to PA actions. The results show that proactive assistant behavior is rated similarly positively to non-proactive behavior. Furthermore, the participants agreed to 73.8% of proactive suggestions. In line with previous research, driving-relevant use cases receive the best ratings, reaching 82.5% acceptance. Finally, the users reacted significantly faster to proactive PA actions, which we interpret as indicating lower cognitive load compared to non-proactive behavior.

2018

Towards an Automatic Assessment of Crowdsourced Data for NLU
Patricia Braunger | Wolfgang Maier | Jan Wessling | Maria Schmidt
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2016

A Comparative Analysis of Crowdsourced Natural Language Corpora for Spoken Dialog Systems
Patricia Braunger | Hansjörg Hofmann | Steffen Werner | Maria Schmidt
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Recent spoken dialog systems have been able to recognize freely spoken user input in restricted domains thanks to statistical methods in automatic speech recognition. These methods require a large number of natural language utterances to train the speech recognition engine and to assess the quality of the system. Since human speech offers many variants associated with a single intent, a large number of user utterances has to be elicited. Developers are therefore turning to crowdsourcing to collect this data. This paper compares three different methods of eliciting multiple utterances for given semantics via crowdsourcing, namely with pictures, with text, and with semantic entities. Specifically, we compare the methods with regard to the amount of valid data and the linguistic variance, for which a quantitative and a qualitative approach are proposed. In our study, the method with text led to a high variance in the utterances and a relatively low rate of invalid data.

2015
