Makoto Nakatsuji
2026
Multi-dimensional Evaluation of Character-Authentic Dialogue Models Learned from Question-Answer Data
Atsushi Otsuka | Kazuya Matsuo | Kenta Hama | Masahiro Mizukami | Tsunehiro Arimoto | Hiroaki Sugiyama | Makoto Nakatsuji | Narichika Nomoto
Proceedings of the Fifteenth Language Resources and Evaluation Conference
Character-authentic dialogue remains challenging for large language models (LLMs) due to limited character-specific data, generic-style collapse, and hallucinations regarding persona facts. Our work presents a comparative evaluation of several learning strategies for character dialogue grounded in question–answer (QA) data: zero/few-shot prompting, supervised fine-tuning (SFT), direct preference optimization (DPO), and a hybrid approach that integrates retrieval-augmented character profiles and knowledge with policy optimization. Using both single-turn and multi-turn settings, we assess multiple dimensions central to character dialogue quality: reproducibility, diversity, hallucination, and character authenticity. Results show that SFT excels in reproducibility and hallucination reduction but tends to shorten and simplify outputs, thereby reducing diversity and authenticity. DPO improves stylistic fidelity and authenticity but depends strongly on externalized character knowledge to limit hallucinations. The hybrid variant that combines character-knowledge retrieval with DPO achieves the best overall balance, delivering strong authenticity while maintaining factual consistency and competitive reproducibility in both single- and multi-turn dialogues. We further analyze the models’ sensitivity to knowledge retrieval and response-length effects and discuss trade-offs among optimization targets that inform practical design choices for developing faithful and engaging character agents trained from scalable QA resources.
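As a rough illustration of how the hybrid variant might pair retrieved character knowledge with preference optimization, the Python sketch below prepends retrieved profile facts to the prompt of each DPO preference pair. The overlap-based retriever, the profile store, and all field names are illustrative assumptions, not the paper's actual pipeline.

# Illustrative sketch: building DPO preference pairs augmented with retrieved
# character knowledge. The retrieval scheme (token overlap) and data fields
# are assumptions for illustration, not the paper's actual method.

def retrieve_character_facts(question, knowledge_base, top_k=3):
    """Rank character-profile facts by simple token overlap with the question."""
    q_tokens = set(question.lower().split())
    scored = [(len(q_tokens & set(fact.lower().split())), fact) for fact in knowledge_base]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [fact for score, fact in scored[:top_k] if score > 0]

def build_dpo_example(question, in_character_answer, generic_answer, knowledge_base):
    """Form one (prompt, chosen, rejected) triple for preference optimization."""
    facts = retrieve_character_facts(question, knowledge_base)
    prompt = "Character profile:\n" + "\n".join(f"- {f}" for f in facts)
    prompt += f"\n\nUser: {question}\nCharacter:"
    return {"prompt": prompt, "chosen": in_character_answer, "rejected": generic_answer}

knowledge_base = [
    "The character grew up in a small coastal town.",
    "The character speaks in short, blunt sentences.",
    "The character dislikes talking about the past.",
]
example = build_dpo_example(
    "Where did you grow up?",
    "A little town by the sea. Not much to say about it.",
    "I grew up in a lovely small town and have many fond memories!",
    knowledge_base,
)
print(example["prompt"])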
Topic-Initiator: A Proactive Chatbot with Personalized Topic RAG for Enhancing Willingness to Converse
Kazuya Matsuo | Atsushi Otsuka | Narichika Nomoto | Makoto Nakatsuji
Proceedings of the Fifteenth Language Resources and Evaluation Conference
Stimulating users’ willingness to converse remains a major challenge in chatbot research. Most existing chatbots respond passively to user inputs, relying on users to select conversation topics, which often reduces their willingness to engage. To address this issue, we propose Topic-Initiator, a proactive chatbot that initiates conversations with new topics aligned to user interests. It gathers information from external sources (e.g., the web) to obtain potentially novel and engaging topics. To support this capability, we also introduce a novel Retrieval-Augmented Generation (RAG) framework, Personalized-Topic RAG (PT-RAG), designed to retrieve new and interesting topics for each user. Unlike existing RAG methods, which fail to surface unseen information, PT-RAG leverages the inference capabilities of Large Language Models (LLMs) to identify content that matches the user’s interests but is not yet known to them. Specifically, PT-RAG estimates a user’s interests and knowledge from past interactions and organizes the collected information into categories. It then uses an LLM to select a category that matches the user’s interests and retrieves information from that category that does not yet appear in the user’s knowledge. Automatic and human evaluations demonstrate that PT-RAG retrieves new and interesting information more accurately and that Topic-Initiator significantly enhances users’ willingness to converse compared to existing methods.
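The following Python sketch illustrates the PT-RAG flow described above (estimate interests and knowledge, pick a matching category, return an unseen item). The keyword heuristics stand in for the LLM-based inference the paper relies on, and all names and data are invented for illustration.

# Illustrative sketch of the PT-RAG idea: pick a topic category matching the
# user's interests, then keep only items the user has not seen. Keyword
# heuristics replace the LLM-based inference described in the abstract.

from collections import Counter

def estimate_profile(past_utterances):
    """Very rough stand-in for interest/knowledge estimation from dialogue history."""
    tokens = [w.lower() for u in past_utterances for w in u.split()]
    interests = {w for w, c in Counter(tokens).items() if c >= 2}
    known = set(tokens)
    return interests, known

def select_new_topic(categorized_items, interests, known):
    """Choose the category overlapping most with interests, then an unseen item."""
    best = max(categorized_items, key=lambda c: len(interests & set(c.lower().split())))
    for item in categorized_items[best]:
        if not set(item.lower().split()) <= known:  # contains something the user has not seen
            return best, item
    return best, None

history = ["I love jazz concerts", "any good jazz albums lately"]
items = {
    "jazz and live music": ["A new jazz trio released a live album this month"],
    "sports": ["The local team won the championship"],
}
interests, known = estimate_profile(history)
print(select_new_topic(items, interests, known))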
2025
ACT: Knowledgeable Agents to Design and Perform Complex Tasks
Makoto Nakatsuji | Shuhei Tateishi | Yasuhiro Fujiwara | Ayaka Matsumoto | Narichika Nomoto | Yoshihide Sato
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large language models enhance collaborative task execution in multi-agent systems. Current studies break a complex task into manageable subtasks, but agents lack an understanding of the overall task and of how the other agents approach theirs, hindering synergy and integration. We propose a method called knowledgeable Agents to design and perform Complex Tasks (ACT), where: (1) Agents independently manage their knowledge and tasks while collaboratively designing the complex task into a more comprehensible form. In parallel, each agent also acquires knowledge of the others, defined as a structured description of how the other agents approach their tasks based on the agent’s own task resolution. (2) Each agent updates its knowledge and refines its task through interactions with the others. By referencing this structured knowledge, the agents effectively integrate their tasks to collaboratively solve the complex task. Three evaluations, including creative writing and tool utilization, show that ACT outperforms existing methods in accurately solving complex tasks.
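A minimal sketch of the agent bookkeeping described for ACT, assuming a trivial placeholder for the LLM-driven reasoning: each agent keeps its own task plus structured descriptions of how the other agents approach theirs, and refines its task after each exchange. The class and function names are illustrative, not the paper's implementation.

# Minimal sketch: agents exchange structured descriptions of their approaches
# and refine their own tasks accordingly. The refinement rule is a placeholder
# for the LLM-driven reasoning in the paper.

class Agent:
    def __init__(self, name, task):
        self.name = name
        self.task = task
        self.knowledge_of_others = {}  # agent name -> structured description

    def describe_approach(self):
        """Structured summary of how this agent intends to solve its task."""
        return {"agent": self.name, "task": self.task, "status": "draft"}

    def incorporate(self, description):
        """Store another agent's approach and refine the local task against it."""
        other = description["agent"]
        self.knowledge_of_others[other] = description
        self.task += f" (aligned with {other})"

def collaboration_round(agents):
    """One round of exchanging approaches among all agents."""
    descriptions = [a.describe_approach() for a in agents]
    for agent in agents:
        for desc in descriptions:
            if desc["agent"] != agent.name:
                agent.incorporate(desc)

agents = [Agent("writer", "draft the story outline"),
          Agent("critic", "check consistency of the plot")]
collaboration_round(agents)
print(agents[0].task)
print(list(agents[0].knowledge_of_others))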
2024
Word-Aware Modality Stimulation for Multimodal Fusion
Shuhei Tateishi | Yasuhito Osugi | Makoto Nakatsuji
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Multimodal learning is generally expected to make more accurate predictions than text-only analysis. Although various methods for fusing multimodal inputs have been proposed for sentiment analysis tasks, we found that their fusion mechanisms, which are based on attention-based language models, may be inhibited from learning non-verbal modalities: the non-verbal features are isolated from linguistic semantics and context and do not contain them, making them unsuitable targets for attention over the text modality during the fusion phase. To address this issue, we propose Word-aware Modality Stimulation Fusion (WA-MSF) to facilitate the integration of non-verbal modalities with the text modality. The Modality Stimulation Unit layer (MSU-layer) is the core concept of WA-MSF; it integrates linguistic contexts and semantics into the non-verbal modalities, thereby instilling linguistic essence into them. Moreover, WA-MSF uses aMLP in the fusion phase to exploit the spatial and temporal representations of the non-verbal modalities more effectively than transformer-based fusion. In our experiments, WA-MSF sets a new state of the art on sentiment prediction tasks.
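To make the modality-stimulation idea concrete, the PyTorch sketch below injects word-level context into a non-verbal stream via cross-attention before fusion. The dimensions, the use of plain cross-attention, and the omission of the aMLP fusion stage are all assumptions rather than the paper's MSU-layer.

# Rough sketch of "modality stimulation": let non-verbal (e.g. audio) features
# attend to word representations so linguistic semantics and context are mixed
# into the non-verbal stream before fusion. Not the paper's architecture.

import torch
import torch.nn as nn

class ModalityStimulation(nn.Module):
    def __init__(self, dim=128, heads=4):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, nonverbal, text):
        # Non-verbal features query the word representations.
        stimulated, _ = self.cross_attn(query=nonverbal, key=text, value=text)
        return self.norm(nonverbal + stimulated)

text = torch.randn(2, 20, 128)   # (batch, words, dim)
audio = torch.randn(2, 50, 128)  # (batch, audio frames, dim)
out = ModalityStimulation()(audio, text)
print(out.shape)                 # torch.Size([2, 50, 128])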
2010
Study of Word Sense Disambiguation System that uses Contextual Features - Approach of Combining Associative Concept Dictionary and Corpus -
Kyota Tsutsumida | Jun Okamoto | Shun Ishizaki | Makoto Nakatsuji | Akimichi Tanaka | Tadasu Uchiyama
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
We propose a Word Sense Disambiguation (WSD) method that accurately classifies ambiguous words into concepts in the Associative Concept Dictionary (ACD), even when the test corpus and the training corpus for WSD are drawn from different domains. Many WSD studies determine the context of the target ambiguous word by analyzing the sentences containing it; however, they perform poorly when applied to a corpus that differs from the training corpus. One solution is to use associated words that are assigned to each concept in the ACD in a domain-independent manner, i.e., words that many users commonly associate with a given concept. Furthermore, by using the associated words of a concept as search queries over a training corpus, our method extracts relevant words, i.e., words that are computationally judged to be related to that concept. By checking the frequency of associated and relevant words appearing near the target word in a test-corpus sentence, our method classifies the target word into a concept in the ACD. Our evaluation using two different types of corpora demonstrates its good accuracy.
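A schematic Python version of the frequency check described above, scoring each candidate concept by how many of its associated/relevant words occur in a window around the ambiguous target word. The tiny lexicon and window size are invented for illustration; the real method draws associated words from the ACD and relevant words from corpus search.

# Schematic sketch: classify a target word to the concept whose associated and
# relevant words occur most often near it. Lexicon and window are illustrative.

def disambiguate(sentence_tokens, target_index, concept_words, window=5):
    lo = max(0, target_index - window)
    hi = target_index + window + 1
    context = {w.lower() for w in sentence_tokens[lo:hi]}
    scores = {concept: len(context & words) for concept, words in concept_words.items()}
    return max(scores, key=scores.get)

concept_words = {
    "bank_financial": {"money", "loan", "deposit", "account"},
    "bank_river": {"river", "water", "shore", "fishing"},
}
tokens = "she opened a deposit account at the bank yesterday".split()
print(disambiguate(tokens, tokens.index("bank"), concept_words))  # bank_financial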