Handwriting Recognition with Multigrams - Normandie Université
Conference paper, Year: 2017

Handwriting Recognition with Multigrams

Thierry Paquet
Yann Soullard
Pierrick Tranouez

Abstract

We introduce a novel handwriting recognition approach based on sub-lexical units known as multigrams of characters, which are variable-length character sequences. A hidden semi-Markov model is used to model the occurrences of multigrams within the target language corpus. Decoding the training language corpus with this model yields an optimized multigram lexicon of reduced size that covers out-of-vocabulary (OOV) words at a high rate compared to the traditional word modeling approach. The handwriting recognition system has two components: the optical model and a statistical n-gram language model over multigrams. The two models are combined during recognition using a decoding technique based on Weighted Finite State Transducers (WFST). We evaluate the approach on two Latin-script datasets (the French RIMES and English IAM datasets) and show that it outperforms word and character language models at high OOV rates and performs similarly to these traditional models at low OOV rates, with the advantage of reduced complexity.
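To make the idea of decoding text into multigrams concrete, here is a minimal sketch of a Viterbi segmentation of a single word into variable-length character units. It is a simplified stand-in, not the hidden semi-Markov model described in the abstract: each unit is scored independently (a unigram model), and the function name `segment`, the table `toy_logprob`, the penalty `unk_logprob`, and all probabilities are hypothetical values chosen for illustration only.

```python
import math


def segment(word, logprob, max_len=4, unk_logprob=-20.0):
    """Return the highest-scoring split of `word` into units of at most
    `max_len` characters, scoring each unit independently (unigram model)."""
    n = len(word)
    best = [-math.inf] * (n + 1)  # best[i]: best log-score of word[:i]
    back = [0] * (n + 1)          # back[i]: start index of the last unit
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            unit = word[start:end]
            # Units missing from the table receive a fixed penalty (assumption).
            score = best[start] + logprob.get(unit, unk_logprob)
            if score > best[end]:
                best[end], back[end] = score, start
    # Follow the back-pointers from the end of the word to recover the units.
    units, i = [], n
    while i > 0:
        units.append(word[back[i]:i])
        i = back[i]
    return units[::-1]


# Hypothetical, hand-picked probabilities; a real lexicon would be
# estimated from a training corpus.
toy_prob = {"re": 0.2, "cog": 0.1, "ni": 0.15, "tion": 0.2,
            "t": 0.05, "i": 0.05, "o": 0.05, "n": 0.05}
toy_logprob = {unit: math.log(p) for unit, p in toy_prob.items()}

print(segment("recognition", toy_logprob))  # ['re', 'cog', 'ni', 'tion']
```

Under these toy values the word is covered by four units instead of eleven characters, which illustrates how a compact multigram lexicon can still spell out words that a fixed word lexicon would not contain.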
No file deposited

Dates and versions

hal-02075753, version 1 (21-03-2019)

Identifiers

Cite

Wassim Swaileh, Thierry Paquet, Yann Soullard, Pierrick Tranouez. Handwriting Recognition with Multigrams. 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Nov 2017, Kyoto, Japan. pp.137-142, ⟨10.1109/ICDAR.2017.31⟩. ⟨hal-02075753⟩
