Hayato Ogawa


2026

Rap is a vocal style rooted in Hip-Hop culture, characterized by producing rhymes in synchrony with a rhythmic beat. This paper proposes a method for generating Japanese rap lyrics with a large language model (LLM) whose rhyming behavior is improved via reinforcement learning. We design a reward function that evaluates end rhymes between two generated bars and apply GRPO, a reinforcement-learning method, to encourage Japanese rhyming without using existing Japanese rap lyrics as training data. Experimental results show that, although output collapse is observed in some cases, GRPO increases the proportion of outputs that receive moderate or high human ratings on rhyme-related criteria.
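
As a rough illustration of the kind of reward described in this abstract, the following minimal sketch scores the end rhyme between two bars by comparing their trailing vowels, then converts a group of rewards into group-relative advantages, the normalization step that GRPO relies on. It assumes the bars are already romanized; the function names, the four-vowel window, and the use of the population standard deviation are illustrative assumptions, not the paper's actual reward design.

```python
import statistics

VOWELS = set("aiueo")


def vowel_tail(bar_romaji: str, n: int = 4) -> str:
    """Return the last n vowels of a romanized bar (hypothetical helper)."""
    vowels = [ch for ch in bar_romaji.lower() if ch in VOWELS]
    return "".join(vowels[-n:])


def end_rhyme_reward(bar_a: str, bar_b: str, n: int = 4) -> float:
    """Fraction of the n-vowel window that matches, counted from the end."""
    tail_a, tail_b = vowel_tail(bar_a, n), vowel_tail(bar_b, n)
    matched = 0
    # Walk backwards from the final vowel and stop at the first mismatch.
    for va, vb in zip(reversed(tail_a), reversed(tail_b)):
        if va != vb:
            break
        matched += 1
    return matched / n


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize rewards within one sampled group: (r - mean) / (std + eps)."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # assumption: population std
    return [(r - mean) / (std + eps) for r in rewards]
```

Under these assumptions, `end_rhyme_reward("saikou", "taihou")` is 1.0 because all four trailing vowels agree, while `end_rhyme_reward("yama", "sakana")` is 0.5 because only two vowels are available to match.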

2025

We develop an embedding model specifically designed for Waka poetry and use it to build a model for detecting Honkadori. Waka is a traditional form of old Japanese poetry that has been composed since ancient times. Honkadori is a sophisticated poetic technique in Japanese classical literature where poets incorporate words or poetic sentiments from old Wakas (Honka) into their own work. First, we fine-tune a pre-trained language model using contrastive learning to construct a Waka-specialized embedding model. Then, using the embedding vectors obtained from this model and features extracted from them, we train a machine learning model to detect the Honka (original poem) of Wakas that employ the Honkadori technique. Using paired data of Honka and Wakas that are considered to use Honkadori, we evaluated the Honka detection model and demonstrated that it can detect Honka with reasonable accuracy.
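
The abstract does not say which contrastive objective is used. As one common possibility, the sketch below computes an InfoNCE-style loss over a batch of (anchor, positive) embedding pairs, treating the other positives in the batch as negatives. The function names, the temperature value, and the choice of InfoNCE itself are assumptions made for illustration only.

```python
import math


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def info_nce_loss(anchors, positives, temperature: float = 0.1) -> float:
    """Mean InfoNCE loss; positives[i] is the match for anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Log-sum-exp with max subtraction for numerical stability.
        max_logit = max(logits)
        log_denominator = max_logit + math.log(
            sum(math.exp(l - max_logit) for l in logits)
        )
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)
```

The loss is small when each anchor is closest to its own positive and large when it is closer to another item in the batch, which is the behavior a Waka-specialized embedding model would be trained toward under this assumed objective.
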
To evaluate the creativity of large language models (LLMs) in Japanese, we construct three benchmarks: Japanese Creativity Questions (JCQ), Divergent Association Task (DAT), and Story Alteration Task (SAT). JCQ comprehensively evaluates creativity using LLMs. Meanwhile, DAT and SAT measure specific aspects of creative ability using embeddings. We also analyze correlations between JCQ and DAT, JCQ and SAT, and DAT and SAT. While JCQ provides comprehensive evaluation, it is relatively time- and resource-intensive. In contrast, DAT and SAT offer lower comprehensiveness but enable quick, low-cost assessment. Additionally, we investigate whether training with DAT contributes to enhancing LLM creativity.
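
To make the embedding-based scoring concrete, the sketch below computes a DAT-style score as the mean pairwise cosine distance between word embeddings, scaled by 100. This follows the usual formulation of the Divergent Association Task; whether the benchmark in this abstract uses exactly this scaling or embedding source is an assumption, and the function names are hypothetical.

```python
import math
from itertools import combinations


def cosine_distance(u: list[float], v: list[float]) -> float:
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def dat_score(embeddings: list[list[float]]) -> float:
    """Mean pairwise cosine distance over all word pairs, times 100."""
    pairs = list(combinations(embeddings, 2))
    if not pairs:
        raise ValueError("need at least two word embeddings")
    return 100.0 * sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)
```

Because the score needs only one embedding lookup per word and a handful of dot products, it is cheap to compute, which is consistent with the quick, low-cost assessment the abstract attributes to DAT.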