Abstract
It is well known that deep learning-based optical character recognition (OCR) systems need a large amount of data to train a high-performance character recognizer. However, collecting a large number of realistic handwritten characters is costly. This paper introduces a Y-Autoencoder (Y-AE)-based handwritten character generator that produces multiple Japanese Hiragana characters from a single image, increasing the amount of data available for training a handwritten character recognizer. An adaptive instance normalization (AdaIN) layer allows the generator to be trained, and to generate handwritten character images, without paired-character image labels. The experiment shows that the Y-AE can generate Japanese character images which, when used to train the handwritten character recognizer, improve its F1-score from 0.8664 to 0.9281. We further analyzed the usefulness of the Y-AE-based generator with shape images, out-of-character (OOC) images whose styles differ from those of the character images seen in model training. The results showed that the generator can produce a handwritten image in a style similar to that of the input character.
- Anthology ID:
- 2022.lrec-1.799
- Volume:
- Proceedings of the Thirteenth Language Resources and Evaluation Conference
- Month:
- June
- Year:
- 2022
- Address:
- Marseille, France
- Editors:
- Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association
- Pages:
- 7344–7351
- URL:
- https://aclanthology.org/2022.lrec-1.799
- Cite (ACL):
- Tomoki Kitagawa, Chee Siang Leow, and Hiromitsu Nishizaki. 2022. Handwritten Character Generation using Y-Autoencoder for Character Recognition Model Training. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 7344–7351, Marseille, France. European Language Resources Association.
- Cite (Informal):
- Handwritten Character Generation using Y-Autoencoder for Character Recognition Model Training (Kitagawa et al., LREC 2022)
- PDF:
- https://preview.aclanthology.org/naacl24-info/2022.lrec-1.799.pdf