Kota Manabe

2026

This study aims to reveal how accurately Large Language Models (LLMs) can handle a speaker’s actual utterances and the true feelings behind them in Japanese dialogue. Speakers use not only private thoughts, which express their true feelings and intentions, but also public statements, which convey their intentions while taking the interlocutor’s feelings and social status into account. While public statements help maintain interpersonal relationships, they can obscure the speaker’s true intention, potentially leading to misunderstandings. We extended existing Japanese dialogue corpora by annotating each dialogue with public-statement and private-thought responses, and then evaluated LLMs’ ability to classify and generate these two types of expressions. The results of the classification task revealed that current LLMs do not understand these expressions at all, and that training with our corpus can significantly improve recognition performance. Furthermore, the results of the generation task demonstrated that generating private thoughts is more difficult than generating public statements, according to both automatic and human evaluations. We release our corpus, which contains 7,964 human-annotated dialogues.


Domain Adaptation of Image Encoder for Multimodal Manga Translation
Kota Manabe | Tomoyuki Kajiwara | Takashi Ninomiya | Isao Goto | Shonosuke Ishiwatari | Hiroshi Noji
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 4: Student Research Workshop)

This paper aims to improve machine translation for manga (Japanese comics) by developing and employing an image encoder that comprehends the visual context of manga more accurately. Conventional manga machine translation systems lack sufficient manga comprehension capabilities when utilizing image information. To address this issue, we propose a domain-adapted image encoder training method for manga. The proposed method trains encoders to acquire visual features that consider the structural and sequential characteristics of manga. This approach draws upon a technique that has proven highly effective in training language models. The image encoders trained by the proposed method are used as visual processors in a multimodal machine translation model and are evaluated on a Japanese-English translation task. The experimental results demonstrate that the proposed method improves translation evaluation metrics, such as BLEU and xCOMET, compared with the conventional method.


We manually construct and publicly release a Japanese dataset for Aspect-based Sentiment Analysis (ABSA), annotated with both sentiment polarity and emotional intensities for Plutchik’s eight emotions. Existing datasets for Japanese ABSA only handle sentiment polarity classification. Therefore, we manually annotated words in the Japanese sentiment analysis corpus WRIME with Plutchik’s eight emotions on a four-point scale and with sentiment polarity on a five-point scale. Analysis of this corpus revealed that word-level emotions reflect the reader’s objective impression more strongly than the writer’s subjective perspective. Furthermore, evaluation experiments on word-level emotion estimation quantitatively demonstrated that while Large Language Models achieve high performance, they struggle to estimate the "trust" emotion. Additionally, we demonstrated that multi-task learning, utilizing both word and sentence levels, can improve performance on difficult-to-estimate subjective emotions.

Co-authors

Hideaki Hayashi 1

Shonosuke Ishiwatari 1

Venues

LREC 2
EACL 1
