Supervised Pretraining for Deep Networks
Abstract
Gradient backpropagation works well only if the initial weights are close to a good solution. Greedy layer-wise pretraining of Deep Neural Networks (DNNs) with autoassociators is a well-known trick for setting appropriate initializations in deep learning. In the literature, however, this pretraining involves only the inputs, while the information conveyed by the labels is ignored. In this paper, we present new pretraining algorithms for DNNs that embed the label information: the weights of the input and hidden layers are initialized in the usual way by autoassociators, and, to set the initial values of the output layer, an autoassociator embedding the output vector into a dedicated space is learned. This space has the same dimension as the last hidden layer, which is set appropriately according to the output size. Empirical evidence shows that this initialization of the architecture, rather than random initialization, leads to better results in terms of generalization error.
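The following is a minimal sketch, not the authors' implementation, of the procedure the abstract describes: hidden layers are initialized by greedy autoassociators on the inputs, and the output layer is initialized from the decoder of an autoassociator trained on the label vectors. All layer widths, data shapes, learning rates, and epoch counts are illustrative assumptions.

```python
import torch
import torch.nn as nn

def pretrain_autoencoder(data, in_dim, hid_dim, epochs=20, lr=1e-2):
    """Train a one-hidden-layer autoassociator; return its encoder and decoder."""
    enc = nn.Linear(in_dim, hid_dim)
    dec = nn.Linear(hid_dim, in_dim)
    opt = torch.optim.SGD(list(enc.parameters()) + list(dec.parameters()), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        recon = dec(torch.sigmoid(enc(data)))      # reconstruct the input
        loss = nn.functional.mse_loss(recon, data)
        loss.backward()
        opt.step()
    return enc, dec

# Toy data: 256 samples, 50 input features, 10 one-hot labels (assumed shapes).
X = torch.rand(256, 50)
Y = torch.eye(10)[torch.randint(0, 10, (256,))]

# Unsupervised step: greedy layer-wise pretraining of the hidden layers.
sizes = [50, 32, 16]                               # illustrative layer widths
encoders, h = [], X
for d_in, d_hid in zip(sizes[:-1], sizes[1:]):
    enc, _ = pretrain_autoencoder(h, d_in, d_hid)
    encoders.append(enc)
    h = torch.sigmoid(enc(h)).detach()             # feed codes to the next layer

# Supervised step: embed the labels into a space whose dimension matches the
# last hidden layer (16 here), then reuse the label *decoder* as the output layer.
lab_enc, lab_dec = pretrain_autoencoder(Y, 10, sizes[-1])

# Assemble the pretrained network: stacked input encoders + label decoder on top.
layers = []
for enc in encoders:
    layers += [enc, nn.Sigmoid()]
layers.append(lab_dec)                             # maps hidden space to labels
model = nn.Sequential(*layers)                     # ready for supervised fine-tuning
```

Under these assumptions, every weight matrix of `model`, including the output layer, starts from a data-dependent initialization rather than a random one; standard supervised fine-tuning with backpropagation would then proceed from this starting point.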