Relaxation of Hard Classification Targets for LSE Minimization

To stabilize the solution against over-fitting, which is especially common for high-order models, we propose a relaxed-target training method for regression models that are linear in their parameters. Relaxing the training targets from the conventional binary values to disjoint classification spaces provides good classification fidelity when a threshold is applied in the decision process. A particular design for relaxing the training targets is provided under practical considerations. An extension to multiple-class problems is formulated, and the method is then applied to a plug-in full multivariate polynomial model and a reduced model on synthetic data sets to illustrate the idea. Additional experiments on real-world data from the UCI [3] data repository provide empirical evidence.
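The abstract does not spell out the relaxation design, so the following is only a minimal sketch of one possible reading, not the authors' method. It assumes the two disjoint target spaces are (-inf, 0] for class 0 and [1, +inf) for class 1, so that outputs already inside the correct space incur no error, and it alternates between re-setting the targets and re-solving a ridge-regularized least-squares problem. The second-order polynomial expansion, the ridge value, the iteration count, and all function names are illustrative assumptions.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def features(x):
    """Second-order polynomial expansion of a 2-D input (linear in parameters)."""
    x1, x2 = x
    return [1.0, x1, x2, x1 * x1, x1 * x2, x2 * x2]

def fit_lse(P, t, ridge=1e-3):
    """Regularized least squares: solve (P^T P + ridge I) a = P^T t."""
    k = len(P[0])
    A = [[sum(p[i] * p[j] for p in P) + (ridge if i == j else 0.0)
          for j in range(k)] for i in range(k)]
    b = [sum(p[i] * ti for p, ti in zip(P, t)) for i in range(k)]
    return solve(A, b)

def predict(a, p):
    return sum(ai * pi for ai, pi in zip(a, p))

def fit_relaxed(P, labels, iters=10, ridge=1e-3):
    """Alternate between relaxing the targets and re-fitting (assumed design)."""
    t = [float(y) for y in labels]          # start from hard binary targets
    a = fit_lse(P, t, ridge)
    for _ in range(iters):
        out = [predict(a, p) for p in P]
        # Class 1: any output >= 1 is acceptable; class 0: any output <= 0.
        t = [max(o, 1.0) if y == 1 else min(o, 0.0)
             for o, y in zip(out, labels)]
        a = fit_lse(P, t, ridge)
    return a

def relaxed_objective(a, P, labels, ridge=1e-3):
    """One-sided squared error plus the ridge penalty."""
    loss = 0.0
    for p, y in zip(P, labels):
        o = predict(a, p)
        loss += max(0.0, 1.0 - o) ** 2 if y == 1 else max(0.0, o) ** 2
    return loss + ridge * sum(ai * ai for ai in a)

def accuracy(a, P, labels, threshold=0.5):
    """Classify by thresholding the regression output."""
    hits = sum((predict(a, p) >= threshold) == (y == 1)
               for p, y in zip(P, labels))
    return hits / len(labels)

# Synthetic two-class data: two Gaussian clouds (illustrative only).
random.seed(0)
X, labels = [], []
for _ in range(100):
    X.append((random.gauss(-1.0, 0.5), random.gauss(-1.0, 0.5)))
    labels.append(0)
    X.append((random.gauss(1.0, 0.5), random.gauss(1.0, 0.5)))
    labels.append(1)
P = [features(x) for x in X]

a_plain = fit_lse(P, [float(y) for y in labels])   # hard binary targets
a_relaxed = fit_relaxed(P, labels)                 # relaxed targets
```

Comparing `relaxed_objective(a_plain, P, labels)` with `relaxed_objective(a_relaxed, P, labels)`, and the two `accuracy` values, gives a rough feel for how much the relaxation changes the fit on this toy data. The decision threshold of 0.5 is likewise an assumption.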

[1]  Ulf Jeppsson, et al. MATLAB™ and Simulink™, 2002.

[2]  T. Poggio, et al. General conditions for predictivity in learning theory, 2004, Nature.

[3]  Catherine Blake, et al. UCI Repository of machine learning databases, 1998.

[4]  Michael E. Tipping. The Relevance Vector Machine, 1999, NIPS.

[5]  Xudong Jiang, et al. A reduced multivariate polynomial model for multimodal biometrics and classifiers fusion, 2004, IEEE Transactions on Circuits and Systems for Video Technology.

[6]  Gerald W. Kimble, et al. Information and Computer Science, 1975.

[7]  Kar-Ann Toh, et al. Benchmarking a reduced multivariate polynomial pattern classifier, 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8]  Carlos Soares, et al. A Meta-Learning Method to Select the Kernel Width in Support Vector Regression, 2004, Machine Learning.

[9]  Mário A. T. Figueiredo. Adaptive Sparseness for Supervised Learning, 2003, IEEE Trans. Pattern Anal. Mach. Intell.

[10]  David G. Stork, et al. Pattern Classification (2nd ed.), 1999.

[11]  Jürgen Schürmann, et al. Pattern classification, 2008.

[12]  David G. Stork, et al. Pattern Classification, 1973.

[13]  Michael E. Tipping. Sparse Bayesian Learning and the Relevance Vector Machine, 2001, J. Mach. Learn. Res.

[14]  Yoram Baram, et al. Partial Classification: The Benefit of Deferred Decision, 1998, IEEE Trans. Pattern Anal. Mach. Intell.

[15]  Vladimir N. Vapnik, et al. The Nature of Statistical Learning Theory, 2000, Statistics for Engineering and Information Science.

[16]  Vladimir Vapnik, et al. Statistical learning theory, 1998.

[17]  Yoram Baram, et al. Soft nearest neighbor classification, 1997, Proceedings of International Conference on Neural Networks (ICNN'97).