Kazuya Matsuo


2026

Character-authentic dialogue remains challenging for large language models (LLMs) due to limited character-specific data, generic-style collapse, and hallucinations regarding persona facts. This work presents a comparative evaluation of several learning strategies for character dialogue grounded in question–answer (QA) data, comparing zero-/few-shot prompting, supervised fine-tuning (SFT), direct preference optimization (DPO), and a hybrid approach that integrates retrieval-augmented character profiles and knowledge with policy optimization. In both single-turn and multi-turn settings, we assess multiple dimensions central to character dialogue quality: reproducibility, diversity, hallucination, and character authenticity. Results show that SFT excels in reproducibility and hallucination reduction but tends to shorten and simplify outputs, thereby reducing diversity and authenticity. DPO improves stylistic fidelity and authenticity but depends strongly on externalized character knowledge to limit hallucinations. The hybrid variant that combines character-knowledge retrieval with DPO achieves the best overall balance, delivering strong authenticity while maintaining factual consistency and competitive reproducibility in both single- and multi-turn dialogues. We further analyze sensitivity to knowledge retrieval and response-length effects and discuss trade-offs among optimization targets that inform practical design choices for developing faithful and engaging character agents trained from scalable QA resources.
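The retrieval-augmentation component of the hybrid approach can be sketched roughly as follows: before generation, the character-knowledge entries most relevant to the user's utterance are retrieved and prepended to the prompt, so the policy-optimized model can ground persona facts in external knowledge. This is an illustrative assumption, not the paper's implementation; the character name, facts, and word-overlap scoring below are all hypothetical stand-ins.

```python
# Hedged sketch: retrieve character-knowledge facts by word overlap
# and prepend them to the generation prompt. All names and the
# scoring heuristic are illustrative assumptions.

def tokenize(text: str) -> set[str]:
    """Lowercase word set with surrounding punctuation stripped."""
    return {w.strip(".,!?").lower() for w in text.split()}

def retrieve_character_facts(query: str, knowledge: list[str], k: int = 2) -> list[str]:
    """Rank persona-fact strings by word overlap with the user query."""
    q = tokenize(query)
    scored = sorted(knowledge, key=lambda fact: len(q & tokenize(fact)), reverse=True)
    return scored[:k]

def build_prompt(character: str, query: str, knowledge: list[str]) -> str:
    """Assemble a generation prompt grounded in retrieved persona facts."""
    facts = retrieve_character_facts(query, knowledge)
    fact_block = "\n".join(f"- {f}" for f in facts)
    return (
        f"You are {character}. Stay in character.\n"
        f"Relevant character knowledge:\n{fact_block}\n"
        f"User: {query}\n{character}:"
    )

# Hypothetical persona facts for a fictional character.
knowledge = [
    "Ayase grew up in a small coastal town.",
    "Ayase is afraid of thunderstorms.",
    "Ayase works as a pastry chef.",
]
prompt = build_prompt("Ayase", "Tell me about your work as a chef", knowledge)
```

In practice a dense retriever would replace the word-overlap heuristic, but the prompt-assembly pattern is the same.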
Stimulating users’ willingness to converse remains a major challenge in chatbot research. Most existing chatbots respond passively to user inputs, relying on users to select conversation topics, which often reduces their willingness. To address this issue, we propose Topic-Initiator, a proactive chatbot that initiates conversations with new topics aligned to user interests. It gathers information from external sources (e.g., the web) to obtain potentially novel and engaging topics. To support this capability, we also introduce a novel Retrieval-Augmented Generation (RAG) framework, Personalized-Topic RAG (PT-RAG), designed to retrieve new and interesting topics for each user. Unlike existing RAG methods, which fail to surface unseen information, PT-RAG leverages the inference capabilities of Large Language Models (LLMs) to identify content that matches the user’s interests but is not yet known to them. Specifically, PT-RAG estimates a user’s interests and knowledge from past interactions and organizes collected information into categories. It then uses an LLM to select a category that matches the user’s interests and to extract, from the selected category, information absent from the user’s knowledge. Automatic and human evaluations demonstrate that PT-RAG retrieves new and interesting information more accurately and that Topic-Initiator significantly enhances users’ willingness to converse compared to existing methods.
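The PT-RAG pipeline described above (estimate interests and knowledge from past interactions, categorize collected items, select the best-matching category, return only unseen items) can be sketched minimally as follows. In the paper an LLM performs the category selection and novelty filtering; here a word-overlap proxy stands in for it, and the data structures are assumptions for illustration.

```python
# Hedged sketch of the PT-RAG retrieval flow, with a word-overlap
# heuristic standing in for the LLM-based selection step.

def build_user_model(history: list[str]) -> tuple[set[str], set[str]]:
    """Estimate interests (words in past turns) and knowledge (the turns)."""
    interests = {w.strip(".,!?").lower() for turn in history for w in turn.split()}
    known = set(history)
    return interests, known

def pt_rag_retrieve(history: list[str], collected: dict[str, list[str]]) -> list[str]:
    """Pick the category best matching the user's interests,
    then keep only items absent from the user's knowledge."""
    interests, known = build_user_model(history)

    def score(category: str) -> int:
        words = {w.lower() for item in collected[category] for w in item.split()}
        return len(words & interests)

    best = max(collected, key=score)
    return [item for item in collected[best] if item not in known]

# Hypothetical usage: past turns and freshly collected web items by category.
history = ["I love jazz music", "jazz concerts are fun"]
collected = {
    "music": ["A new jazz festival opens in Osaka", "jazz concerts are fun"],
    "sports": ["The marathon season starts"],
}
fresh_topics = pt_rag_retrieve(history, collected)
```

Here the previously seen item ("jazz concerts are fun") is filtered out, so only the novel, interest-matched topic remains as a conversation opener.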