
Writing Type and Language Identification in Heterogeneous and Complex Documents

Abstract: This paper presents a system for the automatic recognition of both the writing type and the language of text regions in heterogeneous and complex documents. The system processes documents that mix printed and handwritten text in several languages (French, English and Arabic). We divide the problem into two sub-tasks: writing type identification and language identification. Writing type recognition is based on the analysis of connected components, while language identification combines the analysis of connected components with the analysis of character distributions. We present the results obtained by the system during the second competition round of the MAURDOR campaign and show that its performance compares favorably with that of the other participants.
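To make the idea of identifying a language from character distributions concrete, here is a minimal sketch in Python. It is not the method described in the paper: it assumes the text of a region has already been transcribed, and it compares character-frequency profiles by cosine similarity. The reference samples and the names `SAMPLES`, `char_profile`, `cosine` and `identify_language` are hypothetical and chosen only for illustration.

```python
from collections import Counter
import math

# Toy reference samples (hypothetical; not the MAURDOR training data).
SAMPLES = {
    "French": "le système reconnaît le type d'écriture et la langue des zones de texte dans les documents",
    "English": "the system recognizes the writing type and the language of text regions in the documents",
    "Arabic": "يتعرف النظام على نوع الكتابة ولغة مناطق النص في الوثائق",
}


def char_profile(text):
    """Return the relative frequency of each non-space character in text."""
    counts = Counter(c for c in text.lower() if not c.isspace())
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()} if total else {}


def cosine(p, q):
    """Cosine similarity between two sparse frequency profiles."""
    dot = sum(v * q.get(c, 0.0) for c, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


# One reference profile per language, built from the toy samples.
PROFILES = {lang: char_profile(text) for lang, text in SAMPLES.items()}


def identify_language(text):
    """Pick the language whose reference profile is closest to the text."""
    profile = char_profile(text)
    return max(PROFILES, key=lambda lang: cosine(profile, PROFILES[lang]))
```

Single-character frequencies separate scripts easily (Arabic versus Latin) but distinguish French from English only weakly on short inputs, which may be one reason a practical system would combine them with other cues, such as the connected-component analysis mentioned in the abstract.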
Document type: Conference papers

https://hal-normandie-univ.archives-ouvertes.fr/hal-02110368
Contributor: Sébastien Adam
Submitted on: Thursday, April 25, 2019 - 1:17:13 PM
Last modification on: Tuesday, April 21, 2020 - 10:04:20 AM

Citation

David Hebert, Philippine Barlas, Clément Chatelain, Sébastien Adam, T. Paquet. Writing Type and Language Identification in Heterogeneous and Complex Documents. 2014 14th International Conference on Frontiers in Handwriting Recognition (ICFHR), Sep 2014, Heraklion, Greece. pp. 411-416, ⟨10.1109/ICFHR.2014.75⟩. ⟨hal-02110368⟩
