Daisuke Maekawa
2026
HOTATE: A Japanese Dialogue Corpus Annotated with Responses of Private Thoughts and Public Statements
Yuko Toda | Daisuke Maekawa | Kota Manabe | Eito Yoneyama | Kanade Nonomura | Yuki Fujiwara | Tomoyuki Kajiwara
Proceedings of the Fifteenth Language Resources and Evaluation Conference
This study aims to reveal how accurately Large Language Models (LLMs) can handle a speaker’s actual utterances and the true feelings behind them in Japanese dialogue. Speakers use not only private thoughts, which express their true feelings and intentions, but also public statements, which convey their intentions while considering the interlocutor’s feelings and social status. While public statements help to maintain interpersonal relationships, they can obscure the speaker’s true intention, potentially leading to misunderstandings. We extended existing Japanese dialogue corpora by annotating public-statement and private-thought responses for each dialogue, and then evaluated LLMs’ ability to classify and generate these two types of expressions. The results of the classification task revealed that current LLMs do not understand these expressions at all, and that training with our corpus significantly improves recognition performance. Furthermore, the results of the generation task demonstrated that generating private thoughts is more difficult than generating public statements, according to both automatic and human evaluations. We release our corpus, which contains 7,964 human-annotated dialogues.
Parallel Corpus Filtering Based on Semantic Similarity and Surface Dissimilarity for Japanese Text Simplification with LLMs
Daisuke Maekawa | Tomoyuki Kajiwara | Takashi Ninomiya
Proceedings of the Fifteenth Language Resources and Evaluation Conference
We focus on low-cost fine-tuning of large language models (LLMs) for Japanese text simplification. LLMs have achieved high performance even when fine-tuned on small parallel corpora in tasks such as machine translation and dialogue response generation. In this study, we propose a parallel corpus filtering method for text simplification and investigate how far the number of sentence pairs used to fine-tune LLMs can be reduced. Experimental results on Japanese corpora in three domains revealed that the ability to perform text simplification can be acquired even from a very small corpus of 16 to 64 sentence pairs. Although more parallel data is needed to acquire domain knowledge, our method outperformed full fine-tuning while reducing the training corpus by approximately 70%.
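The title suggests the filter keeps pairs that preserve meaning (high semantic similarity) while changing the surface form substantially (high surface dissimilarity). The abstract does not specify the scoring functions, so the following is only an illustrative sketch: the character-bigram vector stands in for a real sentence embedding, and the thresholds `sem_min` and `surf_max` are hypothetical.

```python
# Illustrative sketch of parallel corpus filtering for simplification:
# keep (complex, simple) pairs whose meaning is preserved but whose
# surface form changed. The bigram "embedding" is a toy stand-in for a
# real semantic encoder; thresholds are arbitrary, not from the paper.
import math
from collections import Counter
from difflib import SequenceMatcher

def char_bigram_vec(text):
    """Toy semantic vector: character-bigram counts."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    num = sum(a[k] * b[k] for k in set(a) & set(b))
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def filter_pairs(pairs, sem_min=0.5, surf_max=0.8):
    """Keep pairs with semantic similarity >= sem_min and surface similarity <= surf_max."""
    kept = []
    for src, tgt in pairs:
        sem = cosine(char_bigram_vec(src), char_bigram_vec(tgt))
        surf = SequenceMatcher(None, src, tgt).ratio()  # character-level surface similarity
        if sem >= sem_min and surf <= surf_max:
            kept.append((src, tgt))
    return kept

pairs = [
    ("the quick brown fox", "brown fox the quick"),  # same content, different surface form: kept
    ("abcd efgh", "abcd efgh"),                      # identical: no rewriting happened, filtered out
]
print(filter_pairs(pairs))
```

In practice the semantic score would come from a sentence encoder rather than character bigrams; the point of the sketch is only the two-sided criterion: reject pairs that are too dissimilar in meaning or too similar in form.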
A Japanese Dataset for Aspect-based Sentiment Polarity Classification and Emotion Intensity Estimation
Kentaro Hanafusa | Kota Manabe | Yuki Maeda | Daisuke Maekawa | Tomoyuki Kajiwara | Hideaki Hayashi | Yuta Nakashima | Hajime Nagahara
Proceedings of the Fifteenth Language Resources and Evaluation Conference
We manually construct and publicly release a Japanese dataset for Aspect-based Sentiment Analysis (ABSA), annotated with both sentiment polarity and emotion intensities for Plutchik’s eight emotions. Existing datasets for Japanese ABSA handle only sentiment polarity classification. Therefore, we manually annotated words in the Japanese sentiment analysis corpus WRIME with Plutchik’s eight emotions on a four-point scale and sentiment polarity on a five-point scale. Analysis of this corpus revealed that word-level emotions reflect the reader’s objective impression more strongly than the writer’s subjective perspective. Furthermore, evaluation experiments on word-level emotion estimation quantitatively demonstrated that while Large Language Models achieve high performance, they struggle to estimate the "trust" emotion. Additionally, we demonstrated that multi-task learning over both word and sentence levels can improve performance on subjective emotions that are difficult to estimate.