Chee Siang Leow
2026
Presentation Slide Translation and Layout Error Correction by LLMs
Futo Kajita | Nobuyori Nishimura | Takehito Utsuro | Naoki Muto | Chee Siang Leow | Hiromitsu Nishizaki
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
We propose a novel approach to translating Japanese slides into English and to correcting their layout errors by utilizing multimodal LLMs with slide images and XML structures. Existing translation tools often suffer from layout errors after translation due to text expansion during the translation process, causing text to overlap with figures or other items in slides and thereby reducing readability. To overcome this issue, our proposed framework introduces two steps: (i) translating text fragments within the slide, and (ii) correcting layout errors by optimizing layout placement based on visual consistency. In step (ii), we empirically show that few-shot prompts are quite effective in layout error correction. Given that the optimal combination of steps (i) and (ii) varies depending on the slide layout, our method generates eight different layout candidates. Consequently, we introduce a third step that automatically selects the optimal output from these eight candidates. The experimental results showed that the proposed method outperformed baselines and achieved a layout error rate of 4.1% and a model selection success rate of over 80%.
2022
Handwritten Character Generation using Y-Autoencoder for Character Recognition Model Training
Tomoki Kitagawa | Chee Siang Leow | Hiromitsu Nishizaki
Proceedings of the Thirteenth Language Resources and Evaluation Conference
It is well known that deep learning-based optical character recognition (OCR) systems need a large amount of data to train a high-performance character recognizer. However, it is costly to collect a large number of realistic handwritten characters. This paper introduces a Y-Autoencoder (Y-AE)-based handwritten character generator that generates multiple Japanese Hiragana characters from a single image to increase the amount of data for training a handwritten character recognizer. The adaptive instance normalization (AdaIN) layer allows the generator to be trained and to generate handwritten character images without paired-character image labels. The experiment shows that the Y-AE could generate Japanese character images that, when used to train the handwritten character recognizer, improved its F1-score from 0.8664 to 0.9281. We further analyzed the usefulness of the Y-AE-based generator with shape images, out-of-character (OOC) images, which have different character image styles in model training. The result showed that the generator could generate a handwritten image with a style similar to that of the input character.