Deep Sequential Neural Networks

Neural networks sequentially build high-level features through their successive layers. We propose a new neural network model in which each layer is associated with a set of candidate mappings. When an input is processed, one mapping among these candidates is selected at each layer according to a sequential decision process. The resulting model is structured as a DAG-like architecture, so that a path from the root to a leaf node defines a sequence of transformations. Instead of applying global transformations, as in classical multilayer networks, this model learns a set of local transformations. It can therefore process data with different characteristics through specific sequences of such local transformations, increasing its expressive power with respect to a classical multilayer network. The learning algorithm is inspired by policy gradient techniques from reinforcement learning and is used here in place of classical back-propagation-based gradient descent. Experiments on different datasets show the relevance of this approach.

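The two mechanisms above (per-layer selection among candidate mappings, and policy-gradient training of the selection process) can be made concrete with a small sketch. The following is a minimal, hypothetical PyTorch illustration, not the authors' implementation: the names (SequentialSelectionLayer, DSNNSketch, n_candidates) are our own, the candidate mappings are plain affine maps, and the learning signal is a basic REINFORCE estimator using the negative per-example loss as reward.

```python
# Minimal sketch (assumed names, not the paper's code): each layer holds
# n_candidates local mappings; a selector samples one per input, and the
# log-probability of the chosen path is kept for the policy gradient.
import torch
import torch.nn as nn
from torch.distributions import Categorical

class SequentialSelectionLayer(nn.Module):
    def __init__(self, dim_in, dim_out, n_candidates):
        super().__init__()
        # Candidate local transformations (here: simple affine maps).
        self.candidates = nn.ModuleList(
            [nn.Linear(dim_in, dim_out) for _ in range(n_candidates)]
        )
        # Selector scores the candidates from the current representation.
        self.selector = nn.Linear(dim_in, n_candidates)

    def forward(self, x):
        dist = Categorical(logits=self.selector(x))
        choice = dist.sample()                     # one candidate per input
        out = torch.stack(
            [self.candidates[int(choice[i])](x[i]) for i in range(x.size(0))]
        )
        return out, dist.log_prob(choice)

class DSNNSketch(nn.Module):
    """Two selection layers: a root-to-leaf path of local transformations."""
    def __init__(self, dim_in, dim_hidden, n_classes, n_candidates=3):
        super().__init__()
        self.layer1 = SequentialSelectionLayer(dim_in, dim_hidden, n_candidates)
        self.layer2 = SequentialSelectionLayer(dim_hidden, n_classes, n_candidates)

    def forward(self, x):
        h, lp1 = self.layer1(x)
        logits, lp2 = self.layer2(torch.tanh(h))
        return logits, lp1 + lp2                   # log-prob of the whole path

# One REINFORCE-style training step: the mappings receive ordinary gradients
# through the chosen path, while the selectors are pushed toward paths whose
# reward (negative per-example loss) is high.
model = DSNNSketch(dim_in=20, dim_hidden=32, n_classes=4)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(16, 20), torch.randint(0, 4, (16,))

opt.zero_grad()
logits, path_log_prob = model(x)
per_example = nn.functional.cross_entropy(logits, y, reduction="none")
reward = -per_example.detach()
loss = per_example.mean() - (reward * path_log_prob).mean()
loss.backward()
opt.step()
```

In practice one would group inputs by selected candidate rather than loop per example, and the paper's actual gradient estimator and reward definition may differ; this sketch only illustrates the selection-plus-policy-gradient pattern.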