Toshiki Kawamoto


2024

Dialogue Systems Can Generate Appropriate Responses without the Use of Question Marks? – a Study of the Effects of “?” for Spoken Dialogue Systems –
Tomoya Mizumoto | Takato Yamazaki | Katsumasa Yoshikawa | Masaya Ohagi | Toshiki Kawamoto | Toshinori Sato
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

When individuals engage in spoken discourse, various phenomena can be observed that differ from those apparent in text-based conversation. While written communication commonly uses a question mark to denote a query, in spoken discourse queries are frequently indicated by a rising intonation at the end of a sentence. However, many speech recognition engines do not append a question mark to recognized queries, which poses a challenge when building a spoken dialogue system: the absence of a question mark at the end of a sentence can impede the generation of appropriate responses to queries. Hence, we investigate the impact of question marks on dialogue systems, and the results show that they have a significant impact. Moreover, we analyze specific examples to determine for which types of utterances the question mark affects the system's responses.
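As a concrete illustration of the setup the abstract describes, the minimal sketch below feeds the same recognized query to a generic dialogue model with and without a restored question mark and compares the responses. The `gpt2` checkpoint and the prompt format are stand-ins chosen for runnability, not the system studied in the paper.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # stand-in checkpoint, not the paper's dialogue model

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)


def generate_response(user_utterance: str) -> str:
    """Generate a system response to a single user utterance."""
    prompt = f"User: {user_utterance}\nSystem:"
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=40,
            do_sample=False,
            pad_token_id=tokenizer.eos_token_id,
        )
    # Keep only the newly generated tokens (drop the prompt).
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()


# The same query as an ASR engine might emit it (no "?") and with the mark restored.
asr_output = "do you like coffee"
print("without ?:", generate_response(asr_output))
print("with    ?:", generate_response(asr_output + "?"))
```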

2023

A Follow-up Study on Evaluation Metrics Using Follow-up Utterances
Toshiki Kawamoto | Yuki Okano | Takato Yamazaki | Toshinori Sato | Kotaro Funakoshi | Manabu Okumura
Proceedings of the 37th Pacific Asia Conference on Language, Information and Computation

An Open-Domain Avatar Chatbot by Exploiting a Large Language Model
Takato Yamazaki | Tomoya Mizumoto | Katsumasa Yoshikawa | Masaya Ohagi | Toshiki Kawamoto | Toshinori Sato
Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue

With the ambition to create avatars capable of human-level casual conversation, we developed an open-domain avatar chatbot, situated in a virtual reality environment, that employs a large language model (LLM). Introducing the LLM posed several challenges for multimodal integration, such as aligning its diverse outputs with avatar control and addressing its slow generation speed. To address these challenges, we integrated various external modules into our system. Our system is based on the award-winning model from the Dialogue System Live Competition 5. Through this work, we hope to stimulate discussion within the research community about the potential and challenges of multimodal dialogue systems enhanced with LLMs.
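The abstract mentions aligning the LLM's output with avatar control only at a high level. The sketch below shows one common way such alignment can be handled: prompting the model to append a control tag to its reply and parsing that tag out before dispatching text and expression separately. The tag format and function names are illustrative assumptions, not the design of the paper's system.

```python
import re


def parse_llm_output(raw: str) -> dict:
    """Split an LLM reply into utterance text and an avatar-control tag.

    Assumes the prompt asked the model to answer as
    '<utterance> [emotion=<label>]'; this tag format is an illustrative
    assumption, not the format used in the paper's system.
    """
    match = re.search(r"\[emotion=(\w+)\]\s*$", raw)
    emotion = match.group(1) if match else "neutral"
    text = re.sub(r"\[emotion=\w+\]\s*$", "", raw).strip()
    return {"text": text, "emotion": emotion}


def dispatch(reply: dict) -> None:
    """Route the parsed reply to the output modules.

    A real system would call its TTS engine and avatar controller here;
    printing stands in for those module calls.
    """
    print("TTS text         :", reply["text"])
    print("Avatar expression:", reply["emotion"])


dispatch(parse_llm_output("Nice to meet you! I love chatting in VR. [emotion=happy]"))
```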

2022

Generating Repetitions with Appropriate Repeated Words
Toshiki Kawamoto | Hidetaka Kamigaito | Kotaro Funakoshi | Manabu Okumura
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

A repetition is a response that repeats words from the previous speaker's utterance in a dialogue. As linguistic studies have shown, repetitions are essential in communication for building trust with others. In this work, we focus on repetition generation; to the best of our knowledge, this is the first neural approach to address it. We propose Weighted Label Smoothing, a smoothing method for explicitly learning which words to repeat during fine-tuning, and a repetition scoring method that outputs more appropriate repetitions during decoding. We conducted automatic and human evaluations in which we applied these methods to the pre-trained language model T5 for generating repetitions. The experimental results indicate that our methods outperformed the baselines in both evaluations.
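The abstract names Weighted Label Smoothing only at a high level. As an illustration, the PyTorch sketch below shows one way a label-smoothing target distribution can be skewed toward repeat-candidate tokens during fine-tuning. The function name, the `repeat_share` split of the smoothing mass, and the mask layout are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F


def weighted_label_smoothing_loss(logits, targets, repeat_mask,
                                  eps=0.1, repeat_share=0.5):
    """Label-smoothed cross-entropy whose smoothing mass favors repeat candidates.

    logits:       (batch, seq_len, vocab) decoder scores
    targets:      (batch, seq_len) gold token ids
    repeat_mask:  (batch, vocab) 1.0 for vocabulary items appearing in the
                  previous speaker's utterance, else 0.0
    eps:          total smoothing mass
    repeat_share: fraction of eps given to repeat candidates (an assumption,
                  not the paper's exact split); padding handling is omitted.
    """
    vocab = logits.size(-1)
    log_probs = F.log_softmax(logits, dim=-1)

    # Uniform share of the non-repeat portion of the smoothing mass.
    target_dist = torch.full_like(log_probs, eps * (1.0 - repeat_share) / vocab)

    # Extra mass spread over the repeat-candidate tokens only.
    n_repeat = repeat_mask.sum(dim=-1, keepdim=True).clamp(min=1.0)  # (batch, 1)
    repeat_bonus = eps * repeat_share * repeat_mask / n_repeat       # (batch, vocab)
    target_dist = target_dist + repeat_bonus.unsqueeze(1)            # broadcast over seq_len

    # The remaining 1 - eps goes to the gold token at each position.
    gold_mass = torch.full_like(targets.unsqueeze(-1), 1.0 - eps,
                                dtype=log_probs.dtype)
    target_dist = target_dist.scatter_add(-1, targets.unsqueeze(-1), gold_mass)

    return -(target_dist * log_probs).sum(dim=-1).mean()


# Toy example: batch of 1, sequence length 2, vocabulary of 5 tokens.
logits = torch.randn(1, 2, 5)
targets = torch.tensor([[2, 4]])
repeat_mask = torch.tensor([[0., 0., 1., 0., 1.]])  # tokens 2 and 4 appeared in the prior utterance
print(weighted_label_smoothing_loss(logits, targets, repeat_mask))
```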