Tutorial on Deep Learning and Applications

Artificial intelligence (AI) has become practical through advances in multidisciplinary research. Large training datasets and GPUs now make it possible to apply deep learning in industry. Remarkable progress in machine learning, data mining, and information theory algorithms has produced results that exceed human ability in some areas, such as image recognition and speech recognition. In this study, we present the background and historical context of artificial intelligence to date. We then introduce current trends in deep learning and its applications. In particular, we explain in detail convolutional neural networks (CNN) [1], recurrent neural networks (RNN) [2], generative models, and multimodal learning, which lead the latest trends. Lastly, we discuss the future direction of deep learning and the prospect of a breakthrough in a third AI winter.

[1] Rajat Raina, et al. Efficient sparse coding algorithms, 2006, NIPS.

[2] Geoffrey E. Hinton, et al. Deep Belief Networks for phone recognition, 2009.

[3] Yihong Gong, et al. Nonlinear Learning using Local Coordinate Coding, 2009, NIPS.

[4] Honglak Lee, et al. Sparse deep belief net model for visual area V2, 2007, NIPS.

[5] Yoshua Bengio, et al. Extracting and composing robust features with denoising autoencoders, 2008, ICML '08.

[6] Jason Weston, et al. A unified architecture for natural language processing: deep neural networks with multitask learning, 2008, ICML '08.

[7] Geoffrey E. Hinton, et al. Factored 3-Way Restricted Boltzmann Machines For Modeling Natural Images, 2010, AISTATS.

[8] Rajat Raina, et al. Self-taught learning: transfer learning from unlabeled data, 2007, ICML '07.

[9] Quoc V. Le, et al. Measuring Invariances in Deep Networks, 2009, NIPS.

[10] Lawrence D. Jackel, et al. Backpropagation Applied to Handwritten Zip Code Recognition, 1989, Neural Computation.

[11] Geoffrey E. Hinton, et al. Factored conditional restricted Boltzmann Machines for modeling motion style, 2009, ICML '09.

[12] Geoffrey E. Hinton, et al. Deep Boltzmann Machines, 2009, AISTATS.

[13] Jean Ponce, et al. Learning mid-level features for recognition, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[14] David J. Field, et al. Emergence of simple-cell receptive field properties by learning a sparse code for natural images, 1996, Nature.

[15] Thomas S. Huang, et al. Supervised translation-invariant sparse coding, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[16] Yoshua Bengio, et al. Exploring Strategies for Training Deep Neural Networks, 2009, J. Mach. Learn. Res.

[17] Yoshua Bengio, et al. Why Does Unsupervised Pre-training Help Deep Learning?, 2010, AISTATS.

[18] Rajat Raina, et al. Self-taught learning, 2009.

[19] Yann LeCun, et al. What is the best multi-stage architecture for object recognition?, 2009, 2009 IEEE 12th International Conference on Computer Vision.

[20] Ruslan Salakhutdinov, et al. On the quantitative analysis of deep belief networks, 2008, ICML '08.

[21] Geoffrey E. Hinton, et al. Phone Recognition with the Mean-Covariance Restricted Boltzmann Machine, 2010, NIPS.

[22] Jean Ponce, et al. A Theoretical Analysis of Feature Pooling in Visual Recognition, 2010, ICML.

[23] Geoffrey E. Hinton, et al. Modeling pixel means and covariances using factorized third-order boltzmann machines, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[24] Yoshua Bengio, et al. Greedy Layer-Wise Training of Deep Networks, 2006, NIPS.

[25] B. Schölkopf, et al. Modeling Human Motion Using Binary Latent Variables, 2007.

[26] Honglak Lee, et al. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations, 2009, ICML '09.

[27] Yann LeCun, et al. Convolutional Learning of Spatio-temporal Features, 2010, ECCV.

[28] Yihong Gong, et al. Linear spatial pyramid matching using sparse coding for image classification, 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[29] Lukás Burget, et al. Recurrent neural network based language model, 2010, INTERSPEECH.

[30] Honglak Lee, et al. Unsupervised feature learning for audio classification using convolutional deep belief networks, 2009, NIPS.

[31] R. Fergus, et al. Learning invariant features through topographic filter maps, 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[32] Michael I. Jordan, et al. Advances in Neural Information Processing Systems 30, 1995.

[33] Marc'Aurelio Ranzato, et al. Fast Inference in Sparse Coding Algorithms with Applications to Object Recognition, 2010, ArXiv.