Evolution strategies based adaptive Lp LS-SVM

Data structure can differ not only between databases but also between two classes of data within the same database. SVM and LS-SVM typically minimize the empirical φ-risk; regularized versions with a fixed penalty (L2 or L1) are non-adaptive because the penalty form is predetermined. They therefore tend to perform well only in certain situations. For example, LS-SVM with an L2 penalty is not preferred if the underlying model is sparse. This paper proposes an adaptive penalty learning procedure, the evolution strategies (ES) based adaptive Lp least squares support vector machine (ES-based Lp LS-SVM), to address this issue. By introducing multiple kernels, an Lp penalty based nonlinear objective function is derived. The iterative re-weighted minimal solver (IRMS) algorithm is used to solve the nonlinear function, and ES is then used to solve the multi-parameter optimization problem. The penalty parameter p and the kernel and regularization parameters are selected adaptively by the proposed ES-based algorithm during training, which makes it easier to reach the optimal solution. Numerical experiments are conducted on two artificial data sets and six real-world data sets. The experimental results show that the proposed procedure offers better generalization performance than the standard SVM, the LS-SVM, and other improved algorithms.
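The two-level idea described above (an inner re-weighted solver for the Lp-penalized objective, an outer evolution strategy over p and the regularization parameter) can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a plain linear least squares model instead of the multiple-kernel LS-SVM, uses standard iteratively re-weighted least squares as a stand-in for IRMS, and uses a simple (1+1)-ES with validation error as fitness. The names `irls_lp` and `es_select`, and all default values, are hypothetical.

```python
import numpy as np

def irls_lp(X, y, p=1.0, lam=0.1, n_iter=50, eps=1e-6):
    """Approximately minimise ||y - Xw||^2 + lam * sum_j |w_j|^p.

    Assumption: standard iteratively re-weighted least squares,
    used here as a stand-in for the IRMS solver named in the abstract.
    """
    d = X.shape[1]
    # Start from the ridge solution (the p = 2 case).
    w = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
    for _ in range(n_iter):
        # Quadratic surrogate of |w_j|^p around the current iterate:
        # weight_j = (p / 2) * |w_j|^(p - 2), smoothed by eps.
        weights = (p / 2.0) * (w ** 2 + eps) ** (p / 2.0 - 1.0)
        w_new = np.linalg.solve(X.T @ X + lam * np.diag(weights), X.T @ y)
        converged = np.max(np.abs(w_new - w)) < 1e-8
        w = w_new
        if converged:
            break
    return w

def es_select(X_tr, y_tr, X_va, y_va, n_gen=30, sigma=0.3, seed=0):
    """(1+1)-ES over (p, log lam); fitness is validation mean squared error.

    Assumption: a deliberately simple ES variant with a crude
    success-based step-size rule, not the scheme used in the paper.
    """
    rng = np.random.default_rng(seed)

    def fitness(theta):
        w = irls_lp(X_tr, y_tr, p=theta[0], lam=np.exp(theta[1]))
        return float(np.mean((X_va @ w - y_va) ** 2))

    parent = np.array([2.0, 0.0])  # start at p = 2, lam = 1
    best = fitness(parent)
    for _ in range(n_gen):
        child = parent + sigma * rng.standard_normal(2)
        child[0] = np.clip(child[0], 0.1, 2.0)     # keep p in (0, 2]
        child[1] = np.clip(child[1], -10.0, 10.0)  # keep lam in a sane range
        f = fitness(child)
        if f <= best:
            parent, best = child, f
            sigma *= 1.5   # widen the search after a success
        else:
            sigma *= 0.9   # narrow it after a failure
    return float(parent[0]), float(np.exp(parent[1])), best
```

Calling `es_select` with a training split and a validation split returns the selected p, the selected regularization parameter, and the validation error reached. With p = 2 the inner solver reduces to ordinary ridge regression, which gives a quick sanity check.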
