Naoki Muto
2026
Presentation Slide Translation and Layout Error Correction by LLMs
Futo Kajita | Nobuyori Nishimura | Takehito Utsuro | Naoki Muto | Chee Siang Leow | Hiromitsu Nishizaki
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
We propose a novel approach to translating Japanese slides into English and correcting the resulting layout errors, using multimodal LLMs with slide images and XML structures. Existing translation tools often introduce layout errors because text expands during translation, causing it to overlap with figures or other slide elements and reducing readability. To overcome this issue, our framework consists of two steps: (i) translating the text fragments within the slide, and (ii) correcting layout errors by optimizing layout placement based on visual consistency. In step (ii), we empirically show that few-shot prompts are quite effective for layout error correction. Because the optimal combination of steps (i) and (ii) varies with the slide layout, our method generates eight different layout candidates, and we introduce a third step that automatically selects the optimal output from among them. Experimental results show that the proposed method outperforms the baselines, achieving a 4.1% layout error rate and a model selection success rate of over 80%.
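As a rough illustration of the kind of layout error described above (translated text overlapping a figure) and of choosing among several layout candidates, the following sketch counts overlapping bounding boxes and picks the candidate with the fewest. It is not the paper's method: the abstract does not specify how the third step selects an output, so the overlap-count heuristic is an assumption made purely for illustration, and the `Box` type and all function names are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of a slide element (hypothetical type, arbitrary units)."""
    left: float
    top: float
    width: float
    height: float


def overlaps(a: Box, b: Box) -> bool:
    """Return True when the two boxes share a region of positive area."""
    return (
        a.left < b.left + b.width
        and b.left < a.left + a.width
        and a.top < b.top + b.height
        and b.top < a.top + a.height
    )


def count_overlaps(boxes: list[Box]) -> int:
    """Count unordered pairs of overlapping boxes in one layout."""
    return sum(1 for a, b in combinations(boxes, 2) if overlaps(a, b))


def select_candidate(candidates: list[list[Box]]) -> int:
    """Return the index of the layout with the fewest overlapping pairs.

    Assumed heuristic, not the selection step from the paper.
    Ties go to the earliest candidate because min() keeps the first minimum.
    """
    return min(range(len(candidates)), key=lambda i: count_overlaps(candidates[i]))
```

Under this assumption, a candidate in which an expanded text box still covers part of a figure scores worse than one in which the text box has been moved or resized clear of it. A real selector would probably also need to weigh readability factors such as font size and alignment, which this sketch ignores.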