An Ensemble Model for Error Modeling with Pseudoinverse Learning Algorithm

In Bayesian theory, the maximum a posteriori (MAP) estimator uses prior information to model the noise in a machine learning model by adding a regularization term to the objective. The L1 and L2 regularization terms correspond to a Laplacian prior and a Gaussian prior, respectively. In existing deep learning models, in order to use gradient descent optimization and achieve good results, most models take L2 regularization as the regularization term of the network, which fits complex Gaussian noise. In practice, however, both Laplace noise and Gaussian noise occur in data. For multi-layer perceptrons, the difficulty caused by adding both L1 and L2 terms to the network's optimization objective is resolved by proposing an ensemble model for error modeling that adopts a divide-and-conquer strategy: first, several base learners are trained to fit different noise distributions of the data; then the predictions of the base learners are taken as new data to train a meta learner, which yields the final result. Among the base learners, a coordinate descent method is used to solve the L1 loss, while the pseudoinverse learning algorithm is employed to solve the L2 loss; both are non-gradient optimization algorithms. Comparison results on several data sets show that the proposed ensemble model achieves better performance.
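
To make the noise-to-loss correspondence above concrete, the following is a standard textbook derivation restated in our own notation (it is not reproduced from the paper): with additive noise $\epsilon = y - f(x)$, maximizing the likelihood under each noise model yields the corresponding loss.

```latex
% Gaussian noise, p(\epsilon) \propto \exp(-\epsilon^{2}/2\sigma^{2}),
% gives the L2 loss after taking the negative log-likelihood:
\hat{f} = \arg\min_{f} \sum_{i} \bigl(y_i - f(x_i)\bigr)^{2}
% Laplace noise, p(\epsilon) \propto \exp(-|\epsilon|/b), gives the L1 loss:
\hat{f} = \arg\min_{f} \sum_{i} \bigl|y_i - f(x_i)\bigr|
% Likewise, a Gaussian or Laplacian prior on the weights w contributes the
% regularization penalty \lambda\|w\|_{2}^{2} or \lambda\|w\|_{1}, respectively.
```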
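
The divide-and-conquer pipeline can also be sketched end to end. The sketch below is a minimal illustration under our own assumptions, not the authors' implementation: a single randomly initialized hidden layer with a closed-form pseudoinverse output solution stands in for the full pseudoinverse learning network, a plain least-absolute-deviations coordinate descent serves as the L1 base learner, and the meta learner is stacked in-sample for brevity (proper stacked generalization would use held-out predictions). All function names and hyperparameters here are hypothetical.

```python
import numpy as np

def make_pil_learner(X, y, n_hidden=64, seed=0):
    """L2-loss base learner: a single random hidden layer whose output
    weights are the closed-form least-squares solution via the
    Moore-Penrose pseudoinverse (no gradient descent involved)."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))
    beta = np.linalg.pinv(np.tanh(X @ W)) @ y
    return lambda Xn: np.tanh(Xn @ W) @ beta

def lad_coordinate_descent(X, y, n_iter=50):
    """L1-loss (least absolute deviations) linear base learner solved by
    coordinate descent; each one-dimensional subproblem is minimized
    exactly by a weighted median, so no gradients are needed either."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        for j in range(d):
            r = y - X @ w + X[:, j] * w[j]     # residual with coord j removed
            mask = np.abs(X[:, j]) > 1e-12
            if not mask.any():
                continue
            cand = r[mask] / X[mask, j]        # candidate values for w[j]
            wts = np.abs(X[mask, j])
            order = np.argsort(cand)
            cum = np.cumsum(wts[order])
            w[j] = cand[order][np.searchsorted(cum, 0.5 * cum[-1])]
    return w

def stacked_ensemble(X, y, X_test):
    """Divide and conquer: fit one base learner per assumed noise
    distribution, then train a meta learner on their predictions
    (stacking); the meta weights are again a pseudoinverse solution."""
    f_l2 = make_pil_learner(X, y)              # Gaussian-noise expert
    w_l1 = lad_coordinate_descent(X, y)        # Laplace-noise expert
    Z = np.column_stack([f_l2(X), X @ w_l1])   # base outputs become new data
    theta = np.linalg.pinv(Z) @ y              # in-sample stacking for brevity
    Z_test = np.column_stack([f_l2(X_test), X_test @ w_l1])
    return Z_test @ theta
```

Note that neither base solver computes a gradient: the L2 weights are a Moore-Penrose pseudoinverse solution, and each L1 coordinate update is an exact weighted median of the partial residuals.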
