Yosuke Ujigawa
2025
Exploring the Impact of Modalities on Building Common Ground Using the Collaborative Scene Reconstruction Task
Yosuke Ujigawa | Asuka Shiotani | Masato Takizawa | Eisuke Midorikawa | Ryuichiro Higashinaka | Kazunori Takashio
Proceedings of the 15th International Workshop on Spoken Dialogue Systems Technology
To deepen our understanding of how verbal and non-verbal modalities contribute to establishing common ground, this study introduces a novel “collaborative scene reconstruction task.” In this task, pairs of participants, each provided with a distinct set of images derived from the same video, work together to reconstruct the sequence of the original video. The level of agreement between the participants on the image order, quantified using Kendall’s rank correlation coefficient, serves as a measure of common ground construction. This approach enables analysis of how various modalities contribute to the construction of common ground. A corpus comprising 40 dialogues from 20 participants was collected and analyzed. The findings suggest that specific gestures play a significant role in fostering common ground, offering valuable insights for the development of dialogue systems that leverage multimodal information to enhance the construction of common ground with users.
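As an illustration of the agreement measure mentioned in the abstract, the following is a minimal sketch of Kendall's rank correlation coefficient between two orderings. It assumes that each participant produces an ordering over the same set of image labels with no ties; the abstract does not specify the exact formulation used, so this is not necessarily the authors' procedure. The function name `kendall_tau` and the image labels are hypothetical.

```python
from itertools import combinations


def kendall_tau(order_a, order_b):
    """Kendall's tau between two orderings of the same items (no ties).

    Returns 1.0 for identical orderings and -1.0 for fully reversed ones.
    Assumption: both sequences contain exactly the same distinct labels.
    """
    if len(order_a) != len(order_b) or set(order_a) != set(order_b):
        raise ValueError("both orderings must contain the same items")
    if len(set(order_a)) != len(order_a):
        raise ValueError("orderings must not contain duplicates")
    n = len(order_a)
    if n < 2:
        raise ValueError("at least two items are required")

    # Position of each item in the second participant's ordering.
    pos_b = {item: rank for rank, item in enumerate(order_b)}

    concordant = 0
    discordant = 0
    # In order_a, the item at index i always precedes the item at index j.
    for i, j in combinations(range(n), 2):
        if pos_b[order_a[i]] < pos_b[order_a[j]]:
            concordant += 1  # order_b agrees on this pair
        else:
            discordant += 1  # order_b reverses this pair

    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Hypothetical orderings from two participants (labels are illustrative).
participant_a = ["img1", "img2", "img3", "img4"]
participant_b = ["img1", "img3", "img2", "img4"]
tau = kendall_tau(participant_a, participant_b)
print(round(tau, 3))
```

In this example only one of the six image pairs (img2, img3) is ordered differently, so the coefficient is (5 - 1) / 6, roughly 0.667. Values closer to 1 would correspond to stronger agreement on the reconstructed sequence.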