Marginal Deep Architectures: Deep Learning for Small and Middle Scale Applications

In recent years, many deep architectures have been proposed in different fields, but most of these models need a large amount of training data to obtain good results. In this paper, we propose a novel deep learning framework for small and middle scale applications, built on stacked feature learning models. Specifically, we stack marginal Fisher analysis (MFA) layer by layer to initialize the deep architecture, and call the result “Marginal Deep Architectures” (MDA). In the implementation of MDA, the weight matrices of MFA are first learned layer by layer, and then deep learning techniques such as backpropagation, dropout and denoising are used to fine-tune the network. To evaluate the effectiveness of MDA, we compare it with shallow feature learning methods and deep learning models on 7 small and middle scale real-world applications, including handwritten digit recognition, speech recognition, historical document understanding, image classification and action recognition. Extensive experiments demonstrate that MDA outperforms not only shallow feature learning models but also state-of-the-art deep learning models on these applications.
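
To make the layer-wise initialization concrete, the sketch below implements a simplified MFA in the graph-embedding formulation (same-class k1-nearest-neighbor pairs form the intrinsic graph, different-class k2-nearest-neighbor pairs form the penalty graph) and stacks it greedily to produce the weight matrices of a feed-forward network. This is a minimal illustration under stated assumptions, not the authors' implementation: the tanh nonlinearity between layers, the neighborhood sizes k1 and k2, the regularization term, and the helper names mfa and stack_mfa_layers are all choices made for the example; the subsequent fine-tuning with backpropagation, dropout and denoising is only indicated in a comment.

    # Hedged sketch: layer-wise MFA initialization for a deep architecture.
    import numpy as np
    from scipy.linalg import eigh
    from scipy.spatial.distance import cdist

    def mfa(X, y, n_components, k1=5, k2=10, reg=1e-3):
        """Simplified marginal Fisher analysis; returns a d x n_components projection."""
        n, d = X.shape
        D = cdist(X, X)                      # pairwise Euclidean distances
        W_intr = np.zeros((n, n))            # intrinsic graph: same-class k1-NN pairs
        W_pen = np.zeros((n, n))             # penalty graph: different-class k2-NN pairs
        for i in range(n):
            same = np.where(y == y[i])[0]
            same = same[same != i]
            diff = np.where(y != y[i])[0]
            if same.size:
                nn = same[np.argsort(D[i, same])[:k1]]
                W_intr[i, nn] = W_intr[nn, i] = 1.0
            if diff.size:
                nn = diff[np.argsort(D[i, diff])[:k2]]
                W_pen[i, nn] = W_pen[nn, i] = 1.0
        L_intr = np.diag(W_intr.sum(1)) - W_intr      # graph Laplacians
        L_pen = np.diag(W_pen.sum(1)) - W_pen
        S_c = X.T @ L_intr @ X + reg * np.eye(d)      # intra-class compactness
        S_p = X.T @ L_pen @ X + reg * np.eye(d)       # marginal (inter-class) separability
        # Minimize compactness relative to separability: smallest generalized eigenvalues.
        vals, vecs = eigh(S_c, S_p)
        return vecs[:, :n_components]

    def stack_mfa_layers(X, y, layer_sizes):
        """Greedily learn one MFA projection per layer; returns weights and top features."""
        weights, H = [], X
        for size in layer_sizes:
            W = mfa(H, y, size)
            weights.append(W)
            H = np.tanh(H @ W)   # nonlinearity between layers (an assumption of this sketch)
        return weights, H

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        X = rng.standard_normal((200, 30))
        y = rng.integers(0, 5, size=200)
        weights, features = stack_mfa_layers(X, y, layer_sizes=[20, 10])
        print([W.shape for W in weights], features.shape)
        # The stacked weight matrices would then initialize a feed-forward network that is
        # fine-tuned with backpropagation, dropout and denoising, as described in the abstract.

The key design point illustrated here is that each layer's weights come from a supervised, closed-form eigenproblem on the previous layer's outputs rather than from unsupervised pre-training, which is why the approach can be attractive when training data are limited.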
