Writing Type and Language Identification in Heterogeneous and Complex Documents
Conference paper, Year: 2014

Abstract

This paper presents a system for automatically recognizing both the writing type and the language of text regions in heterogeneous and complex documents. The system processes documents that mix printed and handwritten text in several languages (French, English and Arabic). We divide the problem into two sub-tasks: writing type identification and language identification. Writing type recognition is based on the analysis of connected components, while language identification combines the analysis of connected components with the analysis of character distributions. We present the results obtained by the system during the second competition round of the MAURDOR campaign and show that its performance compares favorably with that of the other participants.
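To make the idea of analysing character distributions concrete, the following is a minimal sketch and not the authors' method: the abstract does not describe how the distributions are computed, and the actual system works on document images through connected components. The sketch assumes that a text transcription of the region is already available, and it compares character bigram frequencies against per-language profiles using cosine similarity. The sample sentences, the bigram order and all function names are hypothetical choices made for illustration.

```python
from collections import Counter
import math

# Hypothetical, tiny reference samples (assumption: a real system would
# estimate its profiles from a large training corpus per language).
SAMPLES = {
    "french": "le système traite des documents hétérogènes où "
              "l'écriture imprimée et manuscrite est mélangée",
    "english": "the system processes heterogeneous documents with "
               "mixed printed and handwritten writing",
    "arabic": "يعالج النظام وثائق غير متجانسة تحتوي على كتابة "
              "مطبوعة ومكتوبة بخط اليد",
}


def char_distribution(text, n=2):
    """Return relative frequencies of character n-grams in the text."""
    # Lowercase and collapse runs of whitespace to single spaces.
    text = " ".join(text.lower().split())
    grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(grams.values())
    return {g: c / total for g, c in grams.items()} if total else {}


def cosine(p, q):
    """Cosine similarity between two sparse frequency dictionaries."""
    dot = sum(v * q.get(k, 0.0) for k, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


# One bigram profile per language, built from the samples above.
PROFILES = {lang: char_distribution(s) for lang, s in SAMPLES.items()}


def identify_language(text):
    """Pick the language whose profile is most similar to the text."""
    dist = char_distribution(text)
    return max(PROFILES, key=lambda lang: cosine(dist, PROFILES[lang]))


if __name__ == "__main__":
    for region in ("the handwritten and printed documents",
                   "l'écriture manuscrite est mélangée",
                   "كتابة بخط اليد"):
        print(identify_language(region))
```

With profiles this small the decision is only reliable for text close to the samples; the point of the sketch is the comparison of distributions, not the quality of the profiles.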

Dates and versions

hal-02110368, version 1 (25-04-2019)

Cite

David Hebert, Philippine Barlas, Clément Chatelain, Sébastien Adam, T. Paquet. Writing Type and Language Identification in Heterogeneous and Complex Documents. 2014 14th International Conference on Frontiers in Handwriting Recognition (ICFHR), Sep 2014, Heraklion, Greece. pp.411-416, ⟨10.1109/ICFHR.2014.75⟩. ⟨hal-02110368⟩