Learning With Coefficient-Based Regularized Regression on Markov Resampling

Big data research has attracted worldwide attention in recent years. One of the core problems in big data learning is how to extract useful information from massive data sets. In this paper, we propose a Markov resampling algorithm that draws useful samples for the coefficient-based regularized regression (CBRR) problem. The proposed algorithm is a selective sampling method that automatically selects uniformly ergodic Markov chain (u.e.M.c.) samples according to transition probabilities. Based on u.e.M.c. samples, we analyze the theoretical performance of the CBRR algorithm and generalize the existing results for independent and identically distributed observations. Specifically, when the kernel is infinitely differentiable, the learning rate as a function of the sample size $m$ can be arbitrarily close to $\mathcal{O}(m^{-1})$ under a mild regularity condition on the regression function. Experiments on simulated and real data sets validate the good generalization ability of the proposed method.
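The abstract does not state the transition probabilities or the exact regularizer, so the following Python sketch is only an illustration under assumptions, not the algorithm of the paper. It assumes a Gaussian kernel, an l2 penalty on the coefficients (which gives a closed-form solution), and a Metropolis-style acceptance rule in which a candidate with pilot-model loss l' replaces the current sample with loss l with probability min(1, exp(l - l')). All function names and parameter values (`gaussian_kernel`, `fit_cbrr_l2`, `markov_resample`, `lam`, `sigma`) are hypothetical.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=0.5):
    # Pairwise Gaussian kernel matrix between the rows of A and the rows of B.
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / (2.0 * sigma ** 2))

def fit_cbrr_l2(X, y, lam=1e-3, sigma=0.5):
    # Assumed objective: (1/m) * ||K a - y||^2 + lam * ||a||^2,
    # where f(x) = sum_j a_j K(x, x_j). Setting the gradient to zero
    # gives the linear system (K^T K + lam * m * I) a = K^T y.
    m = len(y)
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K.T @ K + lam * m * np.eye(m), K.T @ y)

def predict(X_train, alpha, X_new, sigma=0.5):
    # Evaluate f at the rows of X_new.
    return gaussian_kernel(X_new, X_train, sigma) @ alpha

def markov_resample(losses, n_draws, rng):
    # Assumed acceptance rule (Metropolis-style): a uniformly drawn candidate
    # is accepted with probability min(1, exp(loss[current] - loss[candidate])).
    # Only accepted candidates are appended, so indices may repeat.
    n = len(losses)
    current = rng.integers(n)
    chosen = [current]
    while len(chosen) < n_draws:
        candidate = rng.integers(n)
        accept_prob = np.exp(min(0.0, losses[current] - losses[candidate]))
        if rng.random() < accept_prob:
            current = candidate
            chosen.append(current)
    return np.array(chosen)

# Synthetic data: noisy sine curve on [-1, 1].
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(500, 1))
y = np.sin(np.pi * X[:, 0]) + 0.1 * rng.normal(size=500)

# Pilot model on a small uniform subsample, used only to score candidates.
pilot = rng.choice(500, size=50, replace=False)
alpha0 = fit_cbrr_l2(X[pilot], y[pilot])
losses = (predict(X[pilot], alpha0, X) - y) ** 2

# Draw a Markov chain of indices and refit on the resampled points.
idx = markov_resample(losses, n_draws=100, rng=rng)
alpha = fit_cbrr_l2(X[idx], y[idx])
```

Appending only accepted candidates is a design choice made here for brevity; a textbook Metropolis chain would instead repeat the current state on rejection.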
