Deep multiple multilayer kernel learning in core vector machines

Abstract Over the last few years, deep learning has made dramatic progress in many real-world applications. Deep learning concepts originated in the area of neural networks and have brought major advances in feature learning techniques such as auto-encoders, convolutional neural networks, and recurrent neural networks. In the case of kernel machines, there have been several attempts to model learning machines that mimic deep neural networks. In this direction, the Multilayer Kernel Machine (MKM) was an attempt to build a kernel machine architecture with multiple layers of feature extraction. It is composed of many layers of kernel PCA based feature extraction units, with a support vector machine having an arc-cosine kernel as the final classifier. Other approaches, such as Multiple Kernel Learning (MKL) and deep core vector machines, address the fixed kernel computation problem and the scalability limitations of MKMs, respectively. Beyond these, the use of unsupervised MKL with both single-layer and multilayer kernels in the multilayer feature extraction framework remains to be evaluated. In this context, this paper attempts to build a scalable deep kernel machine with multiple layers of feature extraction. Each kernel PCA based feature extraction layer in this framework is modeled by a combination of both single-layer and multilayer kernels, learned in an unsupervised manner. A core vector machine with an arc-cosine kernel is used as the final-layer classifier, which ensures the scalability of the model. The major contribution of this paper is a novel effort to build a deep structured kernel machine architecture, similar to a deep neural network architecture, for classification. It opens up an extendable research avenue for researchers in deep learning based intelligent systems that leverage the principles of kernel theory. Experiments show that the proposed method consistently improves the generalization performance of the existing deep core vector machine.
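As a rough illustration of the kernels named in the abstract, the following sketch computes a degree-1 arc-cosine kernel, composes it over several layers, and mixes a single-layer and a multilayer kernel. It follows the commonly cited arc-cosine recursion and is not taken from the paper itself. The function names (`arc_cosine_kernel`, `combined_kernel`) and the fixed mixing weights are assumptions made for illustration only; in the proposed framework the combination weights would be learned by unsupervised MKL rather than fixed by hand.

```python
import math


def j1(theta):
    # Angular part of the degree-1 arc-cosine kernel.
    return math.sin(theta) + (math.pi - theta) * math.cos(theta)


def arc_cosine_kernel(x, y, layers=1):
    """Degree-1 arc-cosine kernel composed `layers` times (illustrative sketch)."""
    kxy = sum(a * b for a, b in zip(x, y))  # layer-0 kernel: plain inner product
    kxx = sum(a * a for a in x)
    kyy = sum(b * b for b in y)
    for _ in range(layers):
        norm = math.sqrt(kxx * kyy)
        if norm == 0.0:
            return 0.0  # a zero vector has zero similarity to everything
        # Clamp to [-1, 1] to guard against floating-point drift before acos.
        cos_theta = max(-1.0, min(1.0, kxy / norm))
        theta = math.acos(cos_theta)
        kxy = norm * j1(theta) / math.pi
        # For degree 1, k(x, x) stays equal to ||x||^2 at every layer,
        # so kxx and kyy do not need to be updated.
    return kxy


def combined_kernel(x, y, weights=(0.5, 0.5), depth=2):
    # Hypothetical convex combination of a single-layer and a multilayer kernel.
    # The weights are fixed here by assumption; they are NOT learned.
    w_single, w_multi = weights
    return (w_single * arc_cosine_kernel(x, y, layers=1)
            + w_multi * arc_cosine_kernel(x, y, layers=depth))
```

A Gram matrix built from `combined_kernel` could then be passed to a kernel PCA routine for one feature extraction layer. The final classifier in the paper is a core vector machine, which this sketch does not implement.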
