Domain Adaptation of Image Encoder for Multimodal Manga Translation

Kota Manabe, Tomoyuki Kajiwara, Takashi Ninomiya, Isao Goto, Shonosuke Ishiwatari, Hiroshi Noji


Abstract
The objective of this paper is to enhance machine translation for manga (Japanese comics) by developing and employing an image encoder that is capable of more accurately comprehending its visual context. Conventional manga machine translation systems have faced the challenge of lacking sufficient manga comprehension capabilities when utilizing image information. To address this issue, we propose a domain-adapted image encoder training method for manga. The proposed method involves training encoders to acquire visual features that consider the structural and sequential characteristics of the manga. This approach draws upon a technique that has proven to be highly effective in training language models. The image encoders trained by the proposed methods are used as visual processors in a multimodal machine translation model, and they are evaluated in a Japanese-English translation task. The experimental results demonstrate that the proposed method enhances the performance metrics for translation evaluation, such as BLEU and xCOMET, in comparison to the conventional method.
Anthology ID:
2026.eacl-srw.3
Volume:
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 4: Student Research Workshop)
Month:
March
Year:
2026
Address:
Rabat, Morocco
Editors:
Selene Baez Santamaria, Sai Ashish Somayajula, Atsuki Yamaguchi
Venue:
EACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
17–26
Language:
URL:
https://preview.aclanthology.org/ingest-eacl/2026.eacl-srw.3/
DOI:
Bibkey:
Cite (ACL):
Kota Manabe, Tomoyuki Kajiwara, Takashi Ninomiya, Isao Goto, Shonosuke Ishiwatari, and Hiroshi Noji. 2026. Domain Adaptation of Image Encoder for Multimodal Manga Translation. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 4: Student Research Workshop), pages 17–26, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
Domain Adaptation of Image Encoder for Multimodal Manga Translation (Manabe et al., EACL 2026)
Copy Citation:
PDF:
https://preview.aclanthology.org/ingest-eacl/2026.eacl-srw.3.pdf