Unconstrained Bengali handwriting recognition with recurrent models - Normandie Université
Conference paper. Year: 2015

Unconstrained Bengali handwriting recognition with recurrent models

Abstract

This paper presents a pioneering attempt at developing a recurrent neural network based connectionist system for unconstrained offline Bengali handwriting recognition. The major challenge in configuring such a classification system for a complex script like Bengali is defining the character classes effectively: the system has to deal with more than nine hundred character classes whose occurrence probabilities in the language are highly skewed. A novel way of defining character classes is introduced that makes the recognition problem suitable for a recurrent model. An off-the-shelf BLSTM-CTC recognizer is used. An open-source dataset is developed for unconstrained offline Bengali handwriting recognition; it contains 2,338 handwritten text lines comprising about 21,000 words. Experiments show that, with the new definition of character classes, the BLSTM-CTC recognizer performs well on unconstrained offline Bengali handwriting recognition. The character-level recognition accuracy is 75.40% without any post-processing of the BLSTM-CTC output. Of the 24.60% character-level errors, substitution, deletion and insertion errors account for 18.91%, 4.69% and 0.98%, respectively.
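The split of character-level errors into substitutions, deletions and insertions quoted in the abstract is conventionally obtained from a minimum edit-distance alignment between the reference transcription and the recognizer output. The sketch below illustrates that general technique only; it is not the authors' evaluation code. The function names `edit_counts` and `char_accuracy` are hypothetical, and normalizing by the number of reference characters is an assumption, since the abstract does not state how the rates were normalized.

```python
def edit_counts(reference, hypothesis):
    """Return (substitutions, deletions, insertions) for one minimal
    edit-distance alignment of hypothesis against reference."""
    n, m = len(reference), len(hypothesis)
    # cost[i][j] = minimal number of edits turning reference[:i] into hypothesis[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i  # delete every reference symbol
    for j in range(1, m + 1):
        cost[0][j] = j  # insert every hypothesis symbol
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            mismatch = int(reference[i - 1] != hypothesis[j - 1])
            cost[i][j] = min(
                cost[i - 1][j - 1] + mismatch,  # match or substitution
                cost[i - 1][j] + 1,             # deletion
                cost[i][j - 1] + 1,             # insertion
            )
    # Walk back from the bottom-right corner, counting each edit type.
    subs = dels = ins = 0
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            mismatch = int(reference[i - 1] != hypothesis[j - 1])
            if cost[i][j] == cost[i - 1][j - 1] + mismatch:
                subs += mismatch
                i -= 1
                j -= 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            dels += 1
            i -= 1
        else:
            ins += 1
            j -= 1
    return subs, dels, ins


def char_accuracy(pairs):
    """Accuracy in percent over (reference, hypothesis) pairs.
    Assumption: errors are normalized by total reference length."""
    total = errors = 0
    for reference, hypothesis in pairs:
        errors += sum(edit_counts(reference, hypothesis))
        total += len(reference)
    return 100.0 * (1.0 - errors / total)
```

Under this convention the accuracy is 100% minus the summed error rates, which matches the relationship between the 75.40% accuracy and the 24.60% error figure in the abstract. The sequences can be strings or lists of class labels, which matters for Bengali because one character class may span several Unicode code points.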
No file deposited

Dates and versions

hal-02087603, version 1 (02-04-2019)

Identifiers

HAL Id: hal-02087603
DOI: 10.1109/ICDAR.2015.7333923

Cite

Utpal Garain, Luc Mioulet, B. Chaudhuri, Clément Chatelain, Thierry Paquet. Unconstrained Bengali handwriting recognition with recurrent models. 2015 13th International Conference on Document Analysis and Recognition (ICDAR), Aug 2015, Tunis, Tunisia. pp.1056-1060, ⟨10.1109/ICDAR.2015.7333923⟩. ⟨hal-02087603⟩