Abstract
The Vision-and-Language Navigation (VLN) task involves navigating a mobile agent using linguistic commands and has applications in developing interfaces for autonomous mobility. In reality, natural human communication also encompasses non-verbal cues such as hand gestures and gaze. These gesture-guided instructions have been explored in Human-Robot Interaction systems for effective interaction, particularly in object-referring expressions. However, a notable gap exists in handling gesture-based demonstrative expressions in the outdoor VLN task. To address this, we introduce a novel dataset of gesture-guided outdoor VLN instructions with demonstrative expressions, designed with a focus on complex instructions requiring multi-hop reasoning between the multiple input modalities. In addition, our work includes a comprehensive analysis of the collected data and a comparative evaluation against existing datasets.
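To make the abstract's framing concrete, here is a minimal sketch of what a single gesture-guided instruction record could look like. The class names, fields, and values below are hypothetical illustrations, not the actual GesNavi schema.

```python
# Hypothetical sketch of one record in a gesture-guided outdoor VLN dataset.
# All field names and values are illustrative assumptions, not GesNavi's format.
from dataclasses import dataclass


@dataclass
class GestureCue:
    """A non-verbal cue accompanying the instruction (assumed fields)."""
    kind: str            # e.g. "pointing" or "gaze"
    timestamp_s: float   # offset into the instruction where the cue occurs
    target_bbox: tuple   # (x, y, w, h) of the referred region in the view


@dataclass
class NavigationRecord:
    """One instruction paired with visual context and a ground-truth route."""
    instruction: str     # instruction text containing a demonstrative expression
    view_ids: list       # identifiers of street-level views along the route
    gestures: list       # GestureCue objects aligned to the instruction
    route: list          # sequence of waypoint ids the agent should follow


# "that building" is only resolvable via the pointing gesture, so an agent
# must reason jointly over text, image, and gesture (multi-hop reasoning).
record = NavigationRecord(
    instruction="Go towards that building and turn left at the crossing.",
    view_ids=["view_0012", "view_0013"],
    gestures=[GestureCue(kind="pointing", timestamp_s=1.2,
                         target_bbox=(410, 95, 60, 120))],
    route=["n4", "n7", "n9"],
)
print(record.instruction)
```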
- Anthology ID: 2024.eacl-srw.23
- Volume: Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop
- Month: March
- Year: 2024
- Address: St. Julian’s, Malta
- Editors: Neele Falk, Sara Papi, Mike Zhang
- Venue: EACL
- Publisher: Association for Computational Linguistics
- Pages: 290–295
- URL: https://aclanthology.org/2024.eacl-srw.23
- Cite (ACL): Aman Jain, Teruhisa Misu, Kentaro Yamada, and Hitomi Yanaka. 2024. GesNavi: Gesture-guided Outdoor Vision-and-Language Navigation. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop, pages 290–295, St. Julian’s, Malta. Association for Computational Linguistics.
- Cite (Informal): GesNavi: Gesture-guided Outdoor Vision-and-Language Navigation (Jain et al., EACL 2024)
- PDF: https://preview.aclanthology.org/nschneid-patch-3/2024.eacl-srw.23.pdf