Tsunehiro Arimoto
2026
Multi-dimensional Evaluation of Character-Authentic Dialogue Models Learned from Question-Answer Data
Atsushi Otsuka | Kazuya Matsuo | Kenta Hama | Masahiro Mizukami | Tsunehiro Arimoto | Hiroaki Sugiyama | Makoto Nakatsuji | Narichika Nomoto
Proceedings of the Fifteenth Language Resources and Evaluation Conference
Character-authentic dialogue remains challenging for large language models (LLMs) due to limited character-specific data, generic-style collapse, and hallucinations regarding persona facts. Our work presents a comparative evaluation of several learning strategies for character dialogue grounded in question–answer (QA) data, comparing zero/few-shot prompting, supervised fine-tuning (SFT), direct preference optimization (DPO), and a hybrid approach that integrates retrieval-augmented character profiles and knowledge with policy optimization. Using both single-turn and multi-turn settings, we assess multiple dimensions central to character dialogue quality: reproducibility, diversity, hallucination, and character authenticity. Results show that SFT excels in reproducibility and hallucination reduction but tends to shorten and simplify outputs, thereby reducing diversity and authenticity. DPO improves stylistic fidelity and authenticity but depends strongly on externalized character knowledge to limit hallucinations. The hybrid variant that combines character-knowledge retrieval with DPO achieves the best overall balance, delivering strong authenticity while maintaining factual consistency and competitive reproducibility in both single- and multi-turn dialogues. We further analyze the model’s sensitivity to knowledge retrieval and response-length effects and discuss trade-offs among optimization targets that inform practical design choices for developing faithful and engaging character agents trained from scalable QA resources.
2024
Comparison of the Intimacy Process between Real and Acting-based Long-term Text Chats
Tsunehiro Arimoto | Hiroaki Sugiyama | Hiromi Narimatsu | Masahiro Mizukami
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Long-term chatbots are expected to develop relationships with users. The dominant trend in recent long-term chatbot studies is to train systems on virtual long-term chat data called Multi-Session Chat (MSC), which collects text chats over multiple sessions from crowd workers playing the roles of speakers with defined personas. However, no study has investigated whether such virtual long-term chat can successfully simulate relationship-building between speakers. To clarify the difference between an actual long-term intimacy process and an MSC intimacy process, this study collects real long-term chat and MSC in Japanese and compares them in terms of speech form and dialogue acts. The analysis suggests that MSC speakers show an unnatural tendency: they use non-polite speech levels more than speakers in actual long-term chats, as if they had a close relationship, yet they also ask more questions than speakers in actual long-term chats, as if the relationship were shallow.
2020
Collection and Analysis of Dialogues Provided by Two Speakers Acting as One
Tsunehiro Arimoto | Ryuichiro Higashinaka | Kou Tanaka | Takahito Kawanishi | Hiroaki Sugiyama | Hiroshi Sawada | Hiroshi Ishiguro
Proceedings of the 21th Annual Meeting of the Special Interest Group on Discourse and Dialogue
We are studying a cooperation style in which multiple speakers can provide both advanced dialogue services and operator education. We focus on a style in which two operators interact with a user by pretending to be a single operator. For two operators to act effectively as one, each must adjust his/her conversational content and timing to the other. In the process, we expect each operator to experience the partner's conversational content as if it were his/her own, which should lead to efficient and effective learning of the other's skill. We analyzed this educational effect and examined whether dialogue services can be successfully provided by collecting travel guidance dialogue data from operators who give travel information to users. In this paper, we report our preliminary results on dialogue content and on the satisfaction of operators and users.