Akinori Ito
Multimodal dialog systems that can make natural small talk using facial expressions, gestures, and gaze are highly anticipated as next-generation dialog-based systems. Two important roles of a chat-talk system are keeping the user engaged and establishing rapport. Many studies have conducted user evaluations of such systems, and some reported that considering the relationship with the user is an effective way to improve subjective evaluations. To facilitate research on such dialog systems, we are currently constructing a large-scale multimodal dialog corpus focusing on the relationship between speakers. In this paper, we describe the data collection and annotation process and analyze the corpus collected in the early stage of the project. The corpus contains 19,303 utterances (10 hours) from 19 pairs of participants. Each utterance is annotated with a dialog act tag by two annotators. We compare the frequency and transition probability of the tags across closeness levels to help construct a dialog system that establishes a relationship with the user.
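As an illustration of the kind of analysis described in this abstract, the following Python sketch estimates dialog-act transition probabilities separately for each closeness level; the tag names, data layout, and function name are hypothetical and not taken from the corpus itself.

from collections import Counter, defaultdict

def transition_probabilities(tag_sequences):
    # Estimate P(next_tag | current_tag) from dialog-act tag sequences.
    counts = defaultdict(Counter)
    for seq in tag_sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return {cur: {nxt: c / sum(nexts.values()) for nxt, c in nexts.items()}
            for cur, nexts in counts.items()}

# Hypothetical dialogs grouped by the pair's closeness level; tags are illustrative.
dialogs_by_closeness = {
    "close": [["greeting", "question", "answer", "feedback"]],
    "distant": [["greeting", "greeting", "question", "answer"]],
}
for level, sequences in dialogs_by_closeness.items():
    print(level, transition_probabilities(sequences))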
This paper examines a method for improving the user's impression of a spoken dialog system by introducing a mechanism that gradually changes the form of the system's utterances each time the user uses the system. In some languages, including Japanese, the form of an utterance changes according to the social relationship between the speaker and the listener. This mechanism can therefore be effective for expressing the system's intention to reduce the social distance to the user; however, its actual effect when introduced into a dialog system has not been investigated sufficiently. In this paper, we conduct dialog experiments and show that controlling the form of system utterances can improve users' impressions.
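A minimal sketch of the kind of mechanism this abstract describes, gradually shifting system utterances from a polite to a casual form over repeated sessions; the linear schedule, threshold, and example utterances are assumptions for illustration, not the method evaluated in the paper.

import random

def select_utterance_form(session_count, polite, casual, max_sessions=10):
    # The probability of choosing the casual form grows linearly with usage,
    # so the system's speech style gradually becomes more familiar.
    p_casual = min(session_count / max_sessions, 1.0)
    return casual if random.random() < p_casual else polite

# E.g., Japanese greeting: "Ohayou gozaimasu" (polite) vs. "Ohayou" (casual).
print(select_utterance_form(1, "Ohayou gozaimasu", "Ohayou"))  # almost always polite
print(select_utterance_form(9, "Ohayou gozaimasu", "Ohayou"))  # usually casual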
This paper explores the effect of emotional speech synthesis on a spoken dialogue system when the dialogue is non-task-oriented. Although the use of emotional speech responses has been shown to be effective in limited domains, e.g., scenario-based and counseling dialogue, the effect remains unclear in non-task-oriented dialogue such as voice chatting. For this purpose, we constructed a simple dialogue system with example- and rule-based dialogue management. In the system, two types of emotion labeling with emotion estimation are adopted: system-driven and user-cooperative emotion labeling. We conducted a dialogue experiment in which subjects evaluated the subjective quality of the system and the dialogue from multiple aspects, such as the richness of the dialogue and the impression of the agent. We then analyze and discuss the results and show the advantage of using appropriate emotions for expressive speech responses in a non-task-oriented system.
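The sketch below shows one way example-based response selection with system-driven emotion labels could look; the example database, similarity measure, and function names are assumptions for illustration, not the authors' implementation.

from difflib import SequenceMatcher

# Hypothetical example database: (user utterance, system response, emotion label).
EXAMPLES = [
    ("I went hiking last weekend", "That sounds fun! Where did you go?", "joy"),
    ("My cat has been sick lately", "Oh no, I hope she gets better soon.", "sadness"),
]

def respond(user_utterance):
    # Pick the stored example most similar to the input and return its response
    # together with the emotion label that drives the expressive speech synthesis.
    best = max(EXAMPLES, key=lambda ex: SequenceMatcher(
        None, user_utterance.lower(), ex[0].lower()).ratio())
    _, response, emotion = best
    return response, emotion

print(respond("I was hiking in the mountains"))  # -> ("That sounds fun! ...", "joy")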