Futo Kajita
2026
Presentation Slide Translation and Layout Error Correction by LLMs
Futo Kajita | Nobuyori Nishimura | Takehito Utsuro | Naoki Muto | Chee Siang Leow | Hiromitsu Nishizaki
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
We propose a novel approach to translating Japanese slides into English and correcting their layout errors, using multimodal LLMs that take both slide images and XML structures as input. Existing translation tools often introduce layout errors because text expands during translation, causing it to overlap with figures or other slide items and reducing readability. To overcome this issue, our framework consists of two steps: (i) translating the text fragments within the slide, and (ii) correcting layout errors by optimizing layout placement based on visual consistency. For step (ii), we show empirically that few-shot prompts are effective for layout error correction. Because the optimal combination of steps (i) and (ii) varies with the slide layout, our method generates eight different layout candidates, and a third step automatically selects the optimal output from among them. In our experiments, the proposed method outperformed the baselines, achieving a layout error rate of 4.1% and a model selection success rate of over 80%.
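The selection step can be pictured with a small sketch. The abstract does not say how the optimal candidate is chosen, so everything below is an assumption made for illustration: the `Box` type, the helper names, and the scoring rule (count pairs of overlapping bounding boxes and keep the candidate with the fewest) are hypothetical stand-ins, not the authors' method.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of one slide element (hypothetical type)."""
    left: float
    top: float
    width: float
    height: float


def overlaps(a: Box, b: Box) -> bool:
    """Return True when the two boxes share a region of positive area."""
    return (
        a.left < b.left + b.width
        and b.left < a.left + a.width
        and a.top < b.top + b.height
        and b.top < a.top + a.height
    )


def count_overlaps(boxes: list[Box]) -> int:
    """Count the pairs of elements that overlap within one layout candidate."""
    return sum(overlaps(a, b) for a, b in combinations(boxes, 2))


def select_candidate(candidates: list[list[Box]]) -> int:
    """Return the index of the candidate with the fewest overlapping pairs.

    Assumed scoring rule for illustration only; ties go to the earliest candidate.
    """
    return min(range(len(candidates)), key=lambda i: count_overlaps(candidates[i]))


# Two toy candidates: in the first, the translated text box runs into a figure;
# in the second, the two elements merely touch along an edge.
crowded = [Box(0, 0, 10, 10), Box(5, 5, 10, 10)]
corrected = [Box(0, 0, 10, 10), Box(10, 0, 10, 10)]
best = select_candidate([crowded, corrected])
```

Here `best` is the index of the `corrected` layout, since boxes that only share an edge are not counted as overlapping. A real selector would likely also need to weigh factors such as font size changes or text running off the slide, which this sketch ignores.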
2025
UTSK25 at WAT2025 Patent Claims Translation/Evaluation Task
Haruto Azami | Yin Zhang | Futo Kajita | Nobuyori Nishimura | Takehito Utsuro
Proceedings of the Twelfth Workshop on Asian Translation (WAT 2025)
This paper presents the UTSK25 submission for the English–Japanese and Japanese–English directions of the WAT2025 Patent Claims Translation/Evaluation Task. We use a single translation model for both directions, built from a large language model through monolingual and bilingual continual pretraining followed by bilingual supervised fine-tuning. Finally, we generate translations using prompt engineering to reduce omissions and hallucinations.