Model Selection for Regularized Least-Squares Classification

Regularized Least-Squares Classification (RLSC) can be regarded as a two-layer neural network that combines a regularized squared-loss function with the kernel trick. Poggio and Smale recently reformulated it within the framework of the mathematical foundations of learning and called it a key algorithm of learning theory. Because the generalization performance of RLSC depends heavily on the choice of its kernel and hyperparameters, we present a novel two-step approach for selecting optimal parameters: first, the optimal kernel parameters are chosen by maximizing the kernel-target alignment; then, the optimal regularization hyperparameter is determined by minimizing RLSC's leave-one-out bound. Unlike traditional grid search, our method requires no independent validation set. Experiments on IDA's benchmark datasets with a Gaussian kernel demonstrate that the method is both feasible and time efficient.
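
As a concrete illustration of the two steps, the sketch below is a minimal Python rendering of the procedure, not the paper's implementation. It assumes binary labels in {-1, +1}, a Gaussian kernel of width sigma, and user-supplied grids of candidate sigma and lambda values; all function names are hypothetical. The alignment criterion follows Cristianini et al.'s kernel-target alignment, and, in place of the paper's leave-one-out bound (not reproduced on this page), step two uses the exact closed-form leave-one-out residual of kernel ridge regression, which likewise requires no validation set.

```python
import numpy as np

def gaussian_kernel(X, sigma):
    """Gram matrix of the Gaussian (RBF) kernel with width sigma."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-d2 / (2.0 * sigma ** 2))

def kernel_target_alignment(K, y):
    """A(K, yy^T) = <K, yy^T>_F / (||K||_F ||yy^T||_F); for y in {-1,+1}^n
    this reduces to y^T K y / (n ||K||_F), since ||yy^T||_F = n."""
    n = len(y)
    return float(y @ K @ y) / (n * np.linalg.norm(K, "fro"))

def rlsc_loo_error(K, y, lam):
    """LOO misclassification rate of RLSC, whose dual coefficients solve
    (K + lam*I) c = y.  The LOO residual has the closed form
    c_i / [(K + lam*I)^{-1}]_{ii}, so no model is ever refit."""
    n = len(y)
    H_inv = np.linalg.inv(K + lam * np.eye(n))
    c = H_inv @ y
    loo_pred = y - c / np.diag(H_inv)       # f_{-i}(x_i) for each i
    return float(np.mean(y * loo_pred <= 0))

def select_parameters(X, y, sigmas, lambdas):
    """Step 1: kernel width by maximal alignment; step 2: regularizer
    by minimal LOO error on the resulting Gram matrix."""
    best_sigma = max(
        sigmas, key=lambda s: kernel_target_alignment(gaussian_kernel(X, s), y)
    )
    K = gaussian_kernel(X, best_sigma)
    best_lam = min(lambdas, key=lambda lam: rlsc_loo_error(K, y, lam))
    return best_sigma, best_lam
```

Since both criteria are computed from the training Gram matrix alone, no train/validate split is needed; an eigendecomposition of K could also be reused across all candidate lambda values, reducing step two to a single O(n^3) factorization.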

[1] Sayan Mukherjee, et al. Choosing Multiple Parameters for Support Vector Machines, 2002, Machine Learning.

[2] T. Poggio, et al. Regularized Least-Squares Classification, 2003, Advances in Learning Theory: Methods, Models and Applications.

[3] Johan A. K. Suykens, et al. Least Squares Support Vector Machines, 2002.

[4] Jing Peng, et al. SVM vs regularized least squares classification, 2004, ICPR 2004.

[5] Vladimir N. Vapnik, et al. The Nature of Statistical Learning Theory, 2000, Statistics for Engineering and Information Science.

[6] Felipe Cucker, et al. On the mathematical foundations of learning, 2001.

[7] Tomaso Poggio, et al. Everything old is new again: a fresh look at historical approaches in machine learning, 2002.

[8] David Haussler, et al. Probabilistic kernel regression models, 1999, AISTATS.

[9] Gunnar Rätsch, et al. Soft Margins for AdaBoost, 2001, Machine Learning.

[10] Ramakrishnan Srikant, et al. KDD-2001: Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 26-29, 2001, San Francisco, CA, USA, 2002.

[11] T. Poggio, et al. The Mathematics of Learning: Dealing with Data, 2005, 2005 International Conference on Neural Networks and Brain.

[12] Glenn Fung, et al. Proximal support vector machine classifiers, 2001, KDD '01.

[13] N. Cristianini, et al. On Kernel-Target Alignment, 2001, NIPS.

[14] Johan A. K. Suykens, et al. Advances in Learning Theory: Methods, Models and Applications, 2003.