GraDeT-HTR: A Resource-Efficient Bengali Handwritten Text Recognition System utilizing Grapheme-based Tokenizer and Decoder-only Transformer

Md. Mahmudul Hasan, Ahmed Nesar Tahsin Choudhury, Mahmudul Hasan, Md Mosaddek Khan


Abstract
Despite Bengali being the sixth most spoken language in the world, handwritten text recognition (HTR) systems for Bengali remain severely underdeveloped. The complexity of Bengali script—featuring conjuncts, diacritics, and highly variable handwriting styles—combined with a scarcity of annotated datasets makes this task particularly challenging. We present **GraDeT-HTR**, a resource-efficient Bengali handwritten text recognition system based on a **Gra**pheme-aware **De**coder-only **T**ransformer architecture. To address the unique challenges of Bengali script, we augment the performance of a decoder-only transformer by integrating a grapheme-based tokenizer and demonstrate that it significantly improves recognition accuracy compared to conventional subword tokenizers. Our model is pretrained on large-scale synthetic data and fine-tuned on real human-annotated samples, achieving state-of-the-art performance on multiple benchmark datasets.
Anthology ID:
2025.emnlp-demos.52
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Ivan Habernal, Peter Schulam, Jörg Tiedemann
Venue:
EMNLP
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
696–706
Language:
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-demos.52/
DOI:
Bibkey:
Cite (ACL):
Md. Mahmudul Hasan, Ahmed Nesar Tahsin Choudhury, Mahmudul Hasan, and Md Mosaddek Khan. 2025. GraDeT-HTR: A Resource-Efficient Bengali Handwritten Text Recognition System utilizing Grapheme-based Tokenizer and Decoder-only Transformer. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 696–706, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
GraDeT-HTR: A Resource-Efficient Bengali Handwritten Text Recognition System utilizing Grapheme-based Tokenizer and Decoder-only Transformer (Hasan et al., EMNLP 2025)
Copy Citation:
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-demos.52.pdf