Cyrillic-MNIST: a Cyrillic Version of the MNIST Dataset
Bolat Tleubayev, Zhanel Zhexenova, Kenessary Koishybay, Anara Sandygulova
Abstract
This paper presents a new handwritten dataset, Cyrillic-MNIST, a Cyrillic version of the MNIST dataset, comprising 121,234 samples of 42 Cyrillic letters. Classification performance on Cyrillic-MNIST is evaluated using standard deep learning approaches and compared to that on the Extended MNIST (EMNIST) dataset. The dataset is available at https://github.com/bolattleubayev/cmnist
- Anthology ID: 2022.lrec-1.510
- Volume: Proceedings of the Thirteenth Language Resources and Evaluation Conference
- Month: June
- Year: 2022
- Address: Marseille, France
- Venue: LREC
- Publisher: European Language Resources Association
- Pages: 4767–4773
- URL: https://aclanthology.org/2022.lrec-1.510
- Cite (ACL): Bolat Tleubayev, Zhanel Zhexenova, Kenessary Koishybay, and Anara Sandygulova. 2022. Cyrillic-MNIST: a Cyrillic Version of the MNIST Dataset. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 4767–4773, Marseille, France. European Language Resources Association.
- Cite (Informal): Cyrillic-MNIST: a Cyrillic Version of the MNIST Dataset (Tleubayev et al., LREC 2022)
- PDF: https://preview.aclanthology.org/remove-xml-comments/2022.lrec-1.510.pdf
- Data: How2Sign