Regularization schemes for minimum error entropy principle

We introduce a learning algorithm for regression that combines the minimum error entropy (MEE) principle with regularization schemes in reproducing kernel Hilbert spaces. The empirical MEE algorithm depends strongly on a scaling parameter arising from Parzen windowing. This paper carries out a consistency analysis for the case where the scaling parameter is large, and provides explicit learning rates. Novel approaches are proposed to overcome two difficulties: bounding the output function uniformly, and handling the special feature of MEE that the regression function may not be a minimizer of the error entropy.
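To make the role of the scaling parameter concrete, the following is a minimal sketch of a Parzen-window estimate of the error entropy, not the paper's exact functional. It assumes a Gaussian window and Renyi's quadratic entropy, both common choices in the MEE literature; the function name `empirical_error_entropy` and the argument `h` (standing in for the scaling parameter) are illustrative.

```python
import math


def empirical_error_entropy(residuals, h):
    """Parzen-window estimate of Renyi's quadratic entropy of the residuals.

    Assumptions (not taken from the paper): a Gaussian window and the
    quadratic (order-2) Renyi entropy. `h` is the Parzen scaling parameter.
    """
    e = [float(r) for r in residuals]
    m = len(e)
    if m == 0 or h <= 0:
        raise ValueError("need at least one residual and a positive h")

    # Gaussian normalizing constant for a window of width h.
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * h)

    # Sum the window over all pairs of residual differences e_i - e_j.
    total = 0.0
    for ei in e:
        for ej in e:
            total += math.exp(-((ei - ej) ** 2) / (2.0 * h * h))

    # The pairwise average is often called the information potential;
    # the entropy estimate is its negative logarithm.
    information_potential = norm * total / (m * m)
    return -math.log(information_potential)
```

Because this estimate depends only on pairwise differences of residuals, adding a constant to every residual leaves it unchanged. That gives some intuition for why an entropy criterion alone need not single out the regression function. A regularized scheme would add a penalty such as a multiple of the squared RKHS norm of the candidate function, which this sketch omits.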
