An LLM Benchmark for Addressee Recognition in Multi-modal Multi-party Dialogue
Koji Inoue, Divesh Lala, Mikey Elmers, Keiko Ochi, Tatsuya Kawahara
Abstract
Handling multi-party dialogues represents a significant step for advancing spoken dialogue systems, necessitating the development of tasks specific to multi-party interactions. To address this challenge, we are constructing a multi-modal multi-party dialogue corpus of triadic (three-participant) discussions. This paper focuses on the task of addressee recognition, identifying who is being addressed to take the next turn, a critical component unique to multi-party dialogue systems. A subset of the corpus was annotated with addressee information, revealing that explicit addressees are indicated in approximately 20% of conversational turns. To evaluate the task’s complexity, we benchmarked the performance of a large language model (GPT-4o) on addressee recognition. The results showed that GPT-4o achieved an accuracy only marginally above chance, underscoring the challenges of addressee recognition in multi-party dialogue. These findings highlight the need for further research to enhance the capabilities of large language models in understanding and navigating the intricacies of multi-party conversational dynamics.
- Anthology ID:
- 2025.iwsds-1.36
- Volume:
- Proceedings of the 15th International Workshop on Spoken Dialogue Systems Technology
- Month:
- May
- Year:
- 2025
- Address:
- Bilbao, Spain
- Editors:
- Maria Ines Torres, Yuki Matsuda, Zoraida Callejas, Arantza del Pozo, Luis Fernando D'Haro
- Venues:
- IWSDS | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 330–334
- URL:
- https://preview.aclanthology.org/landing_page/2025.iwsds-1.36/
- Cite (ACL):
- Koji Inoue, Divesh Lala, Mikey Elmers, Keiko Ochi, and Tatsuya Kawahara. 2025. An LLM Benchmark for Addressee Recognition in Multi-modal Multi-party Dialogue. In Proceedings of the 15th International Workshop on Spoken Dialogue Systems Technology, pages 330–334, Bilbao, Spain. Association for Computational Linguistics.
- Cite (Informal):
- An LLM Benchmark for Addressee Recognition in Multi-modal Multi-party Dialogue (Inoue et al., IWSDS 2025)
- PDF:
- https://preview.aclanthology.org/landing_page/2025.iwsds-1.36.pdf