Deep Total Variation Support Vector Networks

Support vector machines (SVMs) have been successful in solving many computer vision tasks including image and video category recognition especially for small and mid-scale training problems. The principle of these non-parametric models is to learn hyperplanes that separate data belonging to different classes while maximizing their margins. However, SVMs constrain the learned hyperplanes to lie in the span of support vectors, fixed/taken from training data, and this reduces their representational power and may lead to limited generalization performances. In this paper, we relax this constraint and allow the support vectors to be learned (instead of being fixed/taken from training data) in order to better fit a given classification task. Our approach, referred to as deep total variation support vector machines, is parametric and relies on a novel deep architecture that learns not only the SVM and the kernel parameters but also the support vectors, resulting into highly effective classifiers. We also show (under a particular setting of the activation functions in this deep architecture) that a large class of kernels and their combinations can be learned. Experiments conducted on the challenging task of skeleton-based action recognition show the outperformance of our deep total variation SVMs w.r.t different baselines as well as the related work.

[1]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Lorenzo Livi,et al.  Graph Neural Networks With Convolutional ARMA Filters , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3]  Jürgen Schmidhuber,et al.  Training Very Deep Networks , 2015, NIPS.

[4]  Mooi Choo Chuah,et al.  Category-Blind Human Action Recognition: A Practical Recognition System , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[5]  Hichem Sahbi,et al.  Semi supervised deep kernel design for image annotation , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[6]  Clément Mallet,et al.  INVESTIGATING THE POTENTIAL OF DEEP NEURAL NETWORKS FOR LARGE-SCALE CLASSIFICATION OF VERY HIGH RESOLUTION SATELLITE IMAGES , 2017 .

[7]  Andrew Zisserman,et al.  Efficient additive kernels via explicit feature maps , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[8]  Hichem Sahbi,et al.  Deep kernel map networks for image annotation , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[9]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[10]  Kilian Q. Weinberger,et al.  Simplifying Graph Convolutional Networks , 2019, ICML.

[11]  Mehryar Mohri,et al.  Learning Non-Linear Combinations of Kernels , 2009, NIPS.

[12]  Hichem Sahbi,et al.  A Hierarchy of Support Vector Machines for Pattern Detection , 2006, J. Mach. Learn. Res..

[13]  Lawrence K. Saul,et al.  Kernel Methods for Deep Learning , 2009, NIPS.

[14]  Prasoon Goyal,et al.  Local Deep Kernel Learning for Efficient Non-linear SVM Prediction , 2013, ICML.

[15]  Anthony Widjaja,et al.  Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond , 2003, IEEE Transactions on Neural Networks.

[16]  Dong Yu,et al.  Deep Learning: Methods and Applications , 2014, Found. Trends Signal Process..

[17]  P. Cochat,et al.  Et al , 2008, Archives de pediatrie : organe officiel de la Societe francaise de pediatrie.

[18]  Cordelia Schmid,et al.  Convolutional Kernel Networks , 2014, NIPS.

[19]  Cristian Sminchisescu,et al.  Random Fourier Approximations for Skewed Multiplicative Histogram Kernels , 2010, DAGM-Symposium.

[20]  Jian Sun,et al.  Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[21]  Nello Cristianini,et al.  Kernel Methods for Pattern Analysis , 2003, ICTAI.

[22]  Max Welling,et al.  Semi-Supervised Classification with Graph Convolutional Networks , 2016, ICLR.

[23]  Ivor W. Tsang,et al.  Two-Layer Multiple Kernel Learning , 2011, AISTATS.

[24]  Yoshua Bengio,et al.  Deep Sparse Rectifier Neural Networks , 2011, AISTATS.

[25]  Boris Hanin,et al.  Universal Function Approximation by Deep Neural Nets with Bounded Width and ReLU Activations , 2017, Mathematics.

[26]  Guigang Zhang,et al.  Deep Learning , 2016, Int. J. Semantic Comput..

[27]  C. Berg,et al.  Harmonic Analysis on Semigroups: Theory of Positive Definite and Related Functions , 1984 .

[28]  Hichem Sahbi,et al.  Superpixel-based object class segmentation using conditional random fields , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[29]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[30]  I. J. Schoenberg,et al.  Metric spaces and positive definite functions , 1938 .

[31]  Vladimir Vapnik,et al.  Statistical learning theory , 1998 .

[32]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Nello Cristianini,et al.  Learning the Kernel Matrix with Semidefinite Programming , 2002, J. Mach. Learn. Res..

[34]  B. Thomee,et al.  Overview of the ImageCLEF 2013 Scalable Concept Image Annotation Subtask , 2013, CLEF.

[35]  Yong Du,et al.  Hierarchical recurrent neural network for skeleton based action recognition , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[36]  Benjamin Recht,et al.  Random Features for Large-Scale Kernel Machines , 2007, NIPS.

[37]  Marc G. Genton,et al.  Classes of Kernels for Machine Learning: A Statistics Perspective , 2002, J. Mach. Learn. Res..

[38]  Hichem Sahbi,et al.  CNRS - TELECOM ParisTech at ImageCLEF 2013 Scalable Concept Image Annotation Task: Winning Annotations with Context Dependent SVMs , 2013, CLEF.

[39]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[40]  Gang Wang,et al.  Skeleton-Based Human Action Recognition With Global Context-Aware Attention LSTM Networks , 2017, IEEE Transactions on Image Processing.

[41]  Dimitris Samaras,et al.  Two-person interaction detection using body-pose features and multiple instance learning , 2012, 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

[42]  Subhransu Maji,et al.  Efficient Classification for Additive Kernel SVMs , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[43]  Mehryar Mohri,et al.  Two-Stage Learning Kernel Algorithms , 2010, ICML.

[44]  Hichem Sahbi,et al.  Bags-of-daglets for action recognition , 2014, 2014 IEEE International Conference on Image Processing (ICIP).

[45]  Juan M. Corchado,et al.  Data-independent Random Projections from the feature-space of the homogeneous polynomial kernel , 2018, Pattern Recognit..

[46]  Stefano Berretti,et al.  A Novel Geometric Framework on Gram Matrix Trajectories for Human Behavior Understanding , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[47]  Hichem Sahbi,et al.  Nonlinear Deep Kernel Learning for Image Annotation , 2017, IEEE Transactions on Image Processing.

[48]  Hichem Sahbi,et al.  Context-dependent kernel design for object matching and recognition , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[49]  Xiaohui Xie,et al.  Co-Occurrence Feature Learning for Skeleton Based Action Recognition Using Regularized Deep LSTM Networks , 2016, AAAI.

[50]  Shyam Visweswaran,et al.  Deep Multiple Kernel Learning , 2013, 2013 12th International Conference on Machine Learning and Applications.

[51]  Hichem Sahbi,et al.  Applying interest operators in semi-fragile video watermarking , 2005, IS&T/SPIE Electronic Imaging.

[52]  Kilian Q. Weinberger,et al.  Learning a kernel matrix for nonlinear dimensionality reduction , 2004, ICML.

[53]  Christian Wolf,et al.  Pose-conditioned Spatio-Temporal Attention for Human Action Recognition , 2017, ArXiv.

[54]  Hichem Sahbi,et al.  Mid-level features and spatio-temporal context for activity recognition , 2012, Pattern Recognit..

[55]  Xavier Bresson,et al.  Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering , 2016, NIPS.

[56]  Hong Cheng,et al.  Interactive body part contrast mining for human interaction recognition , 2014, 2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW).

[57]  Hichem Sahbi,et al.  Nonlinear Cross-View Sample Enrichment for Action Recognition , 2014, ECCV Workshops.

[58]  Po-Sen Huang,et al.  Random features for Kernel Deep Convex Network , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[59]  Hichem Sahbi,et al.  Context-Based Support Vector Machines for Interconnected Image Annotation , 2010, ACCV.

[60]  Nanning Zheng,et al.  View Adaptive Recurrent Neural Networks for High Performance Human Action Recognition from Skeleton Data , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[61]  Hichem Sahbi,et al.  Directed Acyclic Graph Kernels for Action Recognition , 2013, 2013 IEEE International Conference on Computer Vision.

[62]  Chih-Jen Lin,et al.  LIBSVM: A library for support vector machines , 2011, TIST.

[63]  Joseph J. LaViola,et al.  DeepGRU: Deep Gesture Recognition Utility , 2018, ISVC.

[64]  Wenjun Zeng,et al.  An End-to-End Spatio-Temporal Attention Model for Human Action Recognition from Skeleton Data , 2016, AAAI.

[65]  Hervé Glotin,et al.  A Comparative Study of Diversity Methods for Hybrid Text and Image Retrieval Approaches , 2008, CLEF.

[66]  Andrew Gordon Wilson,et al.  Deep Kernel Learning , 2015, AISTATS.

[67]  Xiaojun Qi,et al.  Incorporating multiple SVMs for automatic image annotation , 2007, Pattern Recognit..

[68]  Yihong Gong,et al.  Deep Learning with Kernel Regularization for Visual Recognition , 2008, NIPS.

[69]  Gunnar Rätsch,et al.  Large Scale Multiple Kernel Learning , 2006, J. Mach. Learn. Res..

[70]  Bernhard Schölkopf,et al.  Improving the Accuracy and Speed of Support Vector Machines , 1996, NIPS.

[71]  Alexandros Iosifidis,et al.  Nyström-based approximate kernel subspace learning , 2016, Pattern Recognit..

[72]  Trevor Darrell,et al.  Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[73]  Ming Yang,et al.  DeepFace: Closing the Gap to Human-Level Performance in Face Verification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.