On ψ-Learning

The concept of large margins has been recognized as an important principle in analyzing learning methodologies, including boosting, neural networks, and support vector machines (SVMs). However, this concept alone is not adequate for learning in nonseparable cases. We propose a learning methodology, called ψ-learning, that is derived from a direct consideration of generalization errors. We provide a theory for ψ-learning and show that it essentially attains the optimal rates of convergence in two learning examples. Finally, results from simulation studies and from breast cancer classification confirm the ability of ψ-learning to outperform SVM in generalization.
