Shrinkage learning to improve SVM with hints

The Support Vector Machine (SVM) is one of the most effective and widely used algorithms for classification. Despite its success, SVM suffers from two main issues: (i) some hyperparameters must be tuned in advance and are, in practice, identified through computationally intensive procedures; (ii) a-priori knowledge about the problem (e.g., a doctor's expertise in medical applications) cannot be straightforwardly exploited. In this paper, we introduce a new approach that addresses both problems. Several experiments, performed on real-world benchmark datasets, show that our method outperforms, on average, other techniques proposed in the literature.
