Deep Network with Support Vector Machines

Deep learning methods aims at learning features automatically at multiple levels that allow the system to learn complex functions mapping the input to the output directly from data. This ability to automatically learn powerful features will become increasingly important as the amount of data and range of applications to machine learning methods continues to grow. In this context we propose a deep architecture model using Support Vector Machine (SVM) which has inherent ability to select data points important for classification with good generalization capabilities. Since SVM can effectively discriminate features, we used support vectors with kernel as non-linear discriminant features for classification. By stacking SVMs in to multiple layers, we can obtain deep features without extra feature engineering steps and get robust recognition accuracy. Experimental results show that the proposed method improves generalization performance on Wisconsin Breast Cancer dataset.

[1]  Kunihiko Fukushima,et al.  Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position , 1980, Biological Cybernetics.

[2]  Paul E. Utgoff,et al.  Many-Layered Learning , 2002, Neural Computation.

[3]  Andrew W. Moore,et al.  Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..

[4]  Derek C. Rose,et al.  Deep Machine Learning - A New Frontier in Artificial Intelligence Research [Research Frontier] , 2010, IEEE Computational Intelligence Magazine.

[5]  Corinna Cortes,et al.  Support-Vector Networks , 1995, Machine Learning.

[6]  Yoshua. Bengio,et al.  Learning Deep Architectures for AI , 2007, Found. Trends Mach. Learn..

[7]  Paul Smolensky,et al.  Information processing in dynamical systems: foundations of harmony theory , 1986 .

[8]  Nasser M. Nasrabadi,et al.  Pattern Recognition and Machine Learning , 2006, Technometrics.

[9]  Pascal Vincent,et al.  Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion , 2010, J. Mach. Learn. Res..

[10]  O. Mangasarian,et al.  Robust linear programming discrimination of two linearly inseparable sets , 1992 .

[11]  Gerald Tesauro,et al.  Practical Issues in Temporal Difference Learning , 1992, Mach. Learn..

[12]  Yee Whye Teh,et al.  A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.

[13]  Lawrence D. Jackel,et al.  Backpropagation Applied to Handwritten Zip Code Recognition , 1989, Neural Computation.

[14]  Geoffrey E. Hinton A Practical Guide to Training Restricted Boltzmann Machines , 2012, Neural Networks: Tricks of the Trade.