Shuhei Tateishi
2025
ACT: Knowledgeable Agents to Design and Perform Complex Tasks
Makoto Nakatsuji | Shuhei Tateishi | Yasuhiro Fujiwara | Ayaka Matsumoto | Narichika Nomoto | Yoshihide Sato
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large language models enhance collaborative task execution in multi-agent systems. Current studies break a complex task into manageable subtasks, but the agents lack an understanding of the overall task and of how the other agents approach their own tasks, which hinders synergy and integration. We propose a method called knowledgeable Agents to design and perform Complex Tasks (ACT), in which: (1) agents independently manage their knowledge and tasks while collaboratively redesigning the complex task into a more comprehensible form. In parallel, each agent also acquires knowledge of the others, defined as a structured description of how the other agents approach their tasks, grounded in the agent’s own task resolution. (2) Each agent updates its knowledge and refines its task through interactions with the others. By referencing this structured knowledge, the agents effectively integrate their tasks to collaboratively solve the complex task. Three evaluations, including creative writing and tool utilization, show that ACT outperforms existing methods in accurately solving complex tasks.
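The abstract describes a loop in which agents design the complex task together, acquire structured knowledge of how the other agents approach their tasks, and refine their own tasks through interaction before integrating the results. The following minimal Python sketch illustrates one plausible reading of that loop; the Agent class, the query_llm() stub, the field names, and the fixed number of interaction rounds are assumptions for illustration, not the paper’s actual implementation.

```python
# Minimal sketch of an ACT-style agent loop, under assumed interfaces.
from dataclasses import dataclass, field


def query_llm(prompt: str) -> str:
    """Stand-in for a call to a large language model (hypothetical stub)."""
    return f"[LLM response to: {prompt[:40]}...]"


@dataclass
class Agent:
    name: str
    subtask: str                       # this agent's share of the complex task
    own_knowledge: str = ""            # how the agent plans to solve its subtask
    peer_knowledge: dict = field(default_factory=dict)  # structured view of others

    def design(self, complex_task: str) -> None:
        # Independently describe the subtask in the context of the overall task.
        self.own_knowledge = query_llm(
            f"Task: {complex_task}\nYour subtask: {self.subtask}\nDescribe your approach."
        )

    def observe(self, others: list["Agent"]) -> None:
        # Acquire a structured description of how the other agents approach
        # their subtasks, interpreted from this agent's own perspective.
        for other in others:
            if other is not self:
                self.peer_knowledge[other.name] = query_llm(
                    f"Given my approach: {self.own_knowledge}\n"
                    f"Summarize how {other.name} tackles: {other.own_knowledge}"
                )

    def refine(self) -> None:
        # Update own knowledge using what was learned about the other agents.
        peers = "\n".join(self.peer_knowledge.values())
        self.own_knowledge = query_llm(
            f"Refine this approach:\n{self.own_knowledge}\nConsidering peers:\n{peers}"
        )


def solve(complex_task: str, agents: list[Agent], rounds: int = 2) -> str:
    for agent in agents:
        agent.design(complex_task)
    for _ in range(rounds):            # interaction rounds between agents
        for agent in agents:
            agent.observe(agents)
        for agent in agents:
            agent.refine()
    # Integrate the refined subtask solutions into one answer.
    merged = "\n".join(a.own_knowledge for a in agents)
    return query_llm(f"Integrate these partial solutions:\n{merged}")
```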
2024
Word-Aware Modality Stimulation for Multimodal Fusion
Shuhei Tateishi | Yasuhito Osugi | Makoto Nakatsuji
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Multimodal learning is generally expected to make more accurate predictions than text-only analysis. Various methods for fusing multimodal inputs have been proposed for sentiment analysis tasks, yet we found that their attention-based fusion can be inhibited from learning non-verbal modalities: non-verbal modalities are isolated from linguistic semantics and contexts and do not include them, which makes them unsuitable targets for attention from the text modality during the fusion phase. To address this issue, we propose Word-aware Modality Stimulation Fusion (WA-MSF), which facilitates the integration of non-verbal modalities with the text modality. The Modality Stimulation Unit layer (MSU-layer) is the core concept of WA-MSF; it integrates linguistic contexts and semantics into non-verbal modalities, thereby instilling linguistic essence into them. Moreover, WA-MSF uses aMLP in the fusion phase to utilize spatial and temporal representations of non-verbal modalities more effectively than transformer-based fusion. In our experiments, WA-MSF set a new state-of-the-art level of performance on sentiment prediction tasks.
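The abstract describes an MSU-layer that injects linguistic contexts and semantics into non-verbal modalities before fusion. Below is a minimal PyTorch sketch of one plausible reading of that idea, using cross-attention from non-verbal frames to word representations; the module name, dimensions, and the choice of cross-attention are assumptions for illustration, not the paper’s exact MSU-layer, which the abstract does not specify.

```python
# Sketch of a word-aware "stimulation" step: non-verbal features attend to
# word representations so linguistic semantics flow into the non-verbal stream.
import torch
import torch.nn as nn


class ModalityStimulationUnit(nn.Module):
    def __init__(self, text_dim: int, nonverbal_dim: int, hidden_dim: int = 128):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, hidden_dim)
        self.nonverbal_proj = nn.Linear(nonverbal_dim, hidden_dim)
        # Non-verbal frames act as queries; word representations as keys/values.
        self.cross_attn = nn.MultiheadAttention(hidden_dim, num_heads=4, batch_first=True)
        self.norm = nn.LayerNorm(hidden_dim)

    def forward(self, text_feats: torch.Tensor, nonverbal_feats: torch.Tensor) -> torch.Tensor:
        # text_feats:      (batch, n_words, text_dim)
        # nonverbal_feats: (batch, n_frames, nonverbal_dim)
        q = self.nonverbal_proj(nonverbal_feats)
        kv = self.text_proj(text_feats)
        stimulated, _ = self.cross_attn(q, kv, kv)
        # Residual connection keeps the original non-verbal signal.
        return self.norm(q + stimulated)


if __name__ == "__main__":
    msu = ModalityStimulationUnit(text_dim=768, nonverbal_dim=74)
    text = torch.randn(2, 20, 768)    # e.g. word embeddings from a language model
    audio = torch.randn(2, 50, 74)    # e.g. acoustic frame features
    print(msu(text, audio).shape)     # torch.Size([2, 50, 128])
```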