
A unified multilingual handwriting recognition system using multigrams sub-lexical units

Wassim Swaileh 1 Yann Soullard 1 Thierry Paquet 1 
1 DocApp - LITIS - Equipe Apprentissage
LITIS - Laboratoire d'Informatique, de Traitement de l'Information et des Systèmes
Abstract : We address the design of a unified multilingual system for handwriting recognition. Most multilingual systems rely on specialized models, each trained on a single language, one of which is selected at test time. While some recognition systems are based on a unified optical model, building a unified language model remains a major issue, as traditional language models are generally trained on corpora with a large word lexicon for each language. Here, we propose a solution based on language models built from sub-lexical units, called multigrams. Using multigrams strongly reduces the lexicon size and thus decreases the language model complexity. This makes it possible to design an end-to-end unified multilingual recognition system in which a single optical model and a single language model are both trained on all the languages. We discuss the impact of language unification on each model and show that our system reaches the performance of state-of-the-art methods with a strong reduction in complexity.
Document type :
Journal articles

https://hal-normandie-univ.archives-ouvertes.fr/hal-02075654
Contributor : Accord Elsevier CCSD
Submitted on : Friday, October 22, 2021 - 11:27:54 AM
Last modification on : Wednesday, March 2, 2022 - 10:10:12 AM
Long-term archiving on : Sunday, January 23, 2022 - 7:18:54 PM

File

S0167865518303271.pdf
Files produced by the author(s)

Licence


Distributed under a Creative Commons Attribution - NonCommercial 4.0 International License

Citation

Wassim Swaileh, Yann Soullard, Thierry Paquet. A unified multilingual handwriting recognition system using multigrams sub-lexical units. Pattern Recognition Letters, Elsevier, 2019, 121, pp.68-76. ⟨10.1016/j.patrec.2018.07.027⟩. ⟨hal-02075654⟩
