Challenges in Deep Learning

In recent years, Deep Learning methods and architectures have achieved impressive results, enabling dramatic improvements in performance on many difficult tasks, such as speech recognition, end-to-end machine translation, and image classification and understanding, to name a few. After a brief introduction to some of the main achievements of Deep Learning, we discuss what we believe are the general challenges that should be addressed in the future. We close with a review of the contributions to the ESANN 2016 special session on Deep Learning.