Improving Situated Conversational Agents with Step-by-Step Multi-modal Logic Reasoning
Yuxing Long, Huibin Zhang, Binyuan Hui, Zhenglu Yang, Caixia Yuan, Xiaojie Wang, Fei Huang, Yongbin Li
Abstract
To fulfill complex user requirements in a situated conversational scenario, the agent needs to conduct step-by-step multi-modal logic reasoning, which includes locating objects, querying information and searching objects. However, existing methods omit this multi-step procedure and therefore risk taking shortcuts when making predictions. For example, they may directly copy information from the dialogue history or simply use the textual description without performing visual reasoning. To address this issue and further boost system performance, we apply the dual process theory to plug a reasoner into the original transformer-based model for step-by-step reasoning. When system 2 completes multi-step reasoning, its output is regarded as the final prediction. Our proposed method ranked 1st on the summed scores across all four DSTC-11 SIMMC 2.1 sub-tasks.
- Anthology ID:
- 2023.dstc-1.3
- Volume:
- Proceedings of The Eleventh Dialog System Technology Challenge
- Month:
- September
- Year:
- 2023
- Address:
- Prague, Czech Republic
- Editors:
- Yun-Nung Chen, Paul Crook, Michel Galley, Sarik Ghazarian, Chulaka Gunasekara, Raghav Gupta, Behnam Hedayatnia, Satwik Kottur, Seungwhan Moon, Chen Zhang
- Venues:
- DSTC | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 15–24
- URL:
- https://aclanthology.org/2023.dstc-1.3
- Cite (ACL):
- Yuxing Long, Huibin Zhang, Binyuan Hui, Zhenglu Yang, Caixia Yuan, Xiaojie Wang, Fei Huang, and Yongbin Li. 2023. Improving Situated Conversational Agents with Step-by-Step Multi-modal Logic Reasoning. In Proceedings of The Eleventh Dialog System Technology Challenge, pages 15–24, Prague, Czech Republic. Association for Computational Linguistics.
- Cite (Informal):
- Improving Situated Conversational Agents with Step-by-Step Multi-modal Logic Reasoning (Long et al., DSTC-WS 2023)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2023.dstc-1.3.pdf