A Unified Neural Based Model for Structured Output Problems

Structured output problems are characterized by structural dependencies between the outputs (e.g. the class distribution in image labeling, or the word positions in sequence tagging in natural language processing). Traditionally, graphical models such as HMM and CRF are used to capture the interdependencies of the outputs. In this article, we propose a unified framework to deal with this problem, in which we combine learning the hidden interdependencies of the inputs and the outputs in the same optimization. In our framework, we extend the input pre-training layer technique for deep neural networks to pre-train the output layers, aiming at learning the structure of the outputs. We propose a neural based model, called Input/Output Deep Architecture (IODA), to solve the optimization. Facial landmark detection is a real-world application where the output key points of the face shape have obvious geometric structural dependencies. We perform an evaluation of IODA on this task over two challenging datasets: LFPW and HELEN. We demonstrate that IODA outperforms a deep network with the traditional pre-training technique.

Many recent applications address challenging problems where the output is high-dimensional (discrete or continuous values) and where dependencies lie between these outputs. These dependencies constitute a structure (sequences, strings, trees, graphs, ...) which should be either discovered if unknown, or integrated in the learning algorithm.

The range of applications that deal with structured output data is large. Statistical Natural Language Processing (NLP) is an application where the output has a specified structure, such as in (i) machine translation [Och03] where the output is a sentence, (ii) sentence parsing [ST95] where the output is a parse tree, or (iii) part of speech tagging [Sch94] where the output is a sequence of tags. Bioinformatics also manipulates structured output data, such as in secondary structure prediction of proteins [Jon99] where the output is a sequence modeled by a bipartite graph, or in enzyme function prediction [SY09] where the output is a path in a tree. In speech processing we find speech recognition (speech-to-text) [Rab89] where the prediction is a sentence, and text-to-speech synthesis [ZTB09] where the output is an audio signal.
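The pre-training idea stated in the abstract (pre-train the input layers on the inputs and the output layers on the outputs, then train the whole network) can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: one-hidden-layer autoencoders with tanh units, plain gradient descent on a mean squared error, synthetic data, the layer sizes, and the names `train_autoencoder`, `W_link`, and `forward`. The final supervised fine-tuning step is left out.

```python
import numpy as np

def train_autoencoder(data, n_hidden, epochs=200, lr=0.1, seed=0):
    """Fit a one-hidden-layer autoencoder by gradient descent on the MSE.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    rng = np.random.default_rng(seed)
    n_features = data.shape[1]
    W_enc = 0.1 * rng.standard_normal((n_features, n_hidden))
    b_enc = np.zeros(n_hidden)
    W_dec = 0.1 * rng.standard_normal((n_hidden, n_features))
    b_dec = np.zeros(n_features)
    losses = []
    for _ in range(epochs):
        hidden = np.tanh(data @ W_enc + b_enc)      # encode
        recon = hidden @ W_dec + b_dec              # decode (linear)
        err = recon - data
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the mean squared reconstruction error.
        d_recon = 2.0 * err / err.size
        d_pre = (d_recon @ W_dec.T) * (1.0 - hidden ** 2)
        W_dec -= lr * (hidden.T @ d_recon)
        b_dec -= lr * d_recon.sum(axis=0)
        W_enc -= lr * (data.T @ d_pre)
        b_enc -= lr * d_pre.sum(axis=0)
    return (W_enc, b_enc), (W_dec, b_dec), losses

# Synthetic data (assumption): the outputs Y are driven by 3 latent
# factors, so their 10 dimensions are strongly interdependent.
rng = np.random.default_rng(42)
X = rng.standard_normal((256, 20))
Y = rng.standard_normal((256, 3)) @ rng.standard_normal((3, 10))

# Input pre-training: keep the encoder learned on the inputs.
(in_W, in_b), _, in_losses = train_autoencoder(X, n_hidden=8)
# Output pre-training: keep the decoder learned on the outputs.
_, (out_W, out_b), out_losses = train_autoencoder(Y, n_hidden=4, seed=1)

# Randomly initialised link layer joining the two pre-trained parts.
W_link = 0.1 * rng.standard_normal((8, 4))
b_link = np.zeros(4)

def forward(x):
    """Input encoder -> link layer -> pre-trained output decoder."""
    h_in = np.tanh(x @ in_W + in_b)
    h_out = np.tanh(h_in @ W_link + b_link)
    return h_out @ out_W + out_b

Y_hat = forward(X)  # predictions before any supervised fine-tuning
```

In this sketch the output decoder maps a 4-dimensional code back to the 10 output dimensions, so the last layer starts from weights that already reflect how the outputs vary together. A complete pipeline would follow with supervised training of all layers on the pairs `(X, Y)`.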

Domains

Machine Learning [stat.ML]

https://normandie-univ.hal.science/hal-02346189

Submitted on: Monday, 4 November 2019, 19:22:03

Last modified on: Friday, 22 December 2023, 15:16:05

Dates and versions

hal-02346189, version 1 (4 November 2019)

Identifiers

HAL Id: hal-02346189, version 1

Cite

Soufiane Belharbi, Clément Chatelain, Romain Hérault, Sébastien Adam. A Unified Neural Based Model for Structured Output Problems. Conférence sur l'APprentissage automatique, Jul 2015, Villeneuve d'Ascq, France. ⟨hal-02346189⟩
