A probabilistic learning algorithm for robust modeling using neural networks with random weights

Robust modeling approaches have received considerable attention because of their practical value in dealing with outliers in data. This paper proposes a probabilistic robust learning algorithm for neural networks with random weights (NNRWs) to improve modeling performance. The robust NNRW model is trained by optimizing a hybrid regularization loss function motivated by the sparsity of outliers and by compressive sensing theory. The well-known expectation maximization (EM) algorithm is employed to implement the proposed algorithm under certain assumptions on the noise distribution. Experimental results on function approximation, as well as on UCI data sets for regression and classification, demonstrate that the proposed algorithm is promising, with good potential for real-world applications.
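To make the idea concrete, the following is a minimal sketch rather than the algorithm of the paper. It assumes Laplace-distributed noise (written as a scale mixture of normals), an L1 data-fit term combined with an L2 penalty on the output weights, and sigmoid hidden units. The function names `fit_robust_nnrw` and `predict_nnrw` and all hyperparameters (`n_hidden`, `lam`, `n_iter`, `eps`) are hypothetical choices made for illustration. Under these assumptions the EM iteration reduces to iteratively reweighted ridge regression on the fixed random hidden layer.

```python
import numpy as np


def _hidden(X, W, b):
    # Sigmoid activations of the fixed random hidden layer.
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))


def fit_robust_nnrw(X, y, n_hidden=50, lam=1e-2, n_iter=30, eps=1e-6, seed=0):
    """Illustrative robust NNRW fit (assumed Laplace noise, L2 penalty)."""
    rng = np.random.default_rng(seed)
    # Hidden weights and biases are drawn once and never trained.
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = _hidden(X, W, b)
    I = np.eye(n_hidden)

    # Initialise the output weights with ordinary ridge regression.
    beta = np.linalg.solve(H.T @ H + lam * I, H.T @ y)

    for _ in range(n_iter):
        # E-step (assumption): under a Laplace scale mixture, the expected
        # precision of each sample is proportional to 1 / |residual|.
        r = np.abs(y - H @ beta)
        w = 1.0 / np.maximum(r, eps)
        # M-step: weighted ridge regression for the output weights.
        Hw = H * w[:, None]
        beta = np.linalg.solve(H.T @ Hw + lam * I, Hw.T @ y)
    return W, b, beta


def predict_nnrw(X, W, b, beta):
    # Apply the same random hidden layer, then the learned output weights.
    return _hidden(X, W, b) @ beta
```

Samples with large residuals receive small weights in the M-step, so sparse outliers have limited influence on the output weights, while the hidden layer stays random and fixed throughout.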
