Efficient Ridge Solutions for the Incremental Broad Learning System on Added Inputs by Updating the Inverse or the Inverse Cholesky Factor of the Hermitian Matrix in the Ridge Inverse

This brief proposes two BLS algorithms that improve on the existing BLS algorithm for newly added inputs in [7]. The proposed algorithms avoid computing the ridge inverse; instead, they compute the ridge solution (i.e., the output weights) from the inverse or the inverse Cholesky factor of the Hermitian matrix in the ridge inverse. The proposed BLS algorithm 1 updates the inverse of the Hermitian matrix by the matrix inversion lemma [12]. The proposed BLS algorithm 2 updates the upper-triangular inverse Cholesky factor of the Hermitian matrix by multiplying it with an upper-triangular intermediate matrix, which is computed by a Cholesky factorization or an inverse Cholesky factorization. Assume that the input matrix corresponding to the newly added inputs is p × k, where p is the number of added training samples and k is the total number of nodes. When p > k, the inverse of a sum of matrices [11] is utilized to compute the intermediate variables from a smaller matrix inverse in the proposed algorithm 1, or from a smaller inverse Cholesky factorization in the proposed algorithm 2. Since the Hermitian matrix in the ridge inverse is usually smaller than the ridge inverse itself, the proposed algorithms 1 and 2 require fewer floating-point operations (flops) than the existing BLS algorithm, as verified by theoretical flop counts. In numerical experiments, for the case of p > k, the speedups in each additional training time of the proposed BLS algorithms 1 and 2 over the existing algorithm are 1.95 to 5.43 and 2.29 to 6.34, respectively; for the case of p < k, the corresponding speedups are 8.83 to 10.21 and 2.28 to 2.58, respectively.
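To make the first update concrete, the following is a minimal NumPy sketch of a rank-p ridge update via the matrix inversion lemma, in the spirit of the proposed algorithm 1. It assumes the standard ridge formulation W = (A^T A + lambda I)^(-1) A^T Y; the function names and interfaces are illustrative assumptions, not taken from the paper.

import numpy as np

def ridge_init(A, Y, lam):
    # Initial ridge solution: Q_inv is the inverse of the k x k Hermitian
    # matrix A^T A + lam * I, and W is the output-weight matrix.
    k = A.shape[1]
    Q_inv = np.linalg.inv(A.T @ A + lam * np.eye(k))
    W = Q_inv @ (A.T @ Y)
    return Q_inv, W

def add_inputs_inverse_update(Q_inv, W, A_p, Y_p):
    # Update Q_inv and W for p newly added rows (A_p, Y_p) by the matrix
    # inversion lemma, so the ridge inverse itself is never formed.
    # Only a p x p system is solved, which is cheap when p < k.
    p = A_p.shape[0]
    G = Q_inv @ A_p.T                    # k x p
    S = np.eye(p) + A_p @ G              # p x p "capacitance" matrix
    Q_inv = Q_inv - G @ np.linalg.solve(S, G.T)
    W = W + Q_inv @ A_p.T @ (Y_p - A_p @ W)
    return Q_inv, W

For p > k, the abstract indicates that the inverse of a sum of matrices [11] is applied instead, so the intermediate variables come from a smaller (k x k) computation; the sketch above covers only the branch where the p x p system is the smaller one.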

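In the same hedged spirit, here is a sketch of the inverse Cholesky update behind the proposed algorithm 2: the upper-triangular factor F, with Q^(-1) = F F^T, is multiplied by an upper-triangular intermediate matrix V obtained from a Cholesky factorization, so the updated factor F V stays upper triangular. Again, the interface is an assumption for illustration.

import numpy as np

def inv_cholesky_upper(M):
    # Upper-triangular V with inv(M) = V @ V.T, obtained from the
    # lower-triangular Cholesky factor L of the Hermitian matrix M.
    L = np.linalg.cholesky(M)            # M = L @ L.T
    return np.linalg.inv(L).T            # inv(M) = L^{-T} @ L^{-1} = V @ V.T

def add_inputs_cholesky_update(F, W, A_p, Y_p):
    # Update the upper-triangular inverse Cholesky factor F (with
    # Q_inv = F @ F.T) and the ridge solution W for p added rows.
    k = F.shape[0]
    B = A_p @ F                          # p x k
    M = np.eye(k) + B.T @ B              # k x k intermediate Hermitian matrix
    V = inv_cholesky_upper(M)            # upper-triangular intermediate factor
    F = F @ V                            # product of upper triangulars is upper triangular
    Q_inv = F @ F.T
    W = W + Q_inv @ A_p.T @ (Y_p - A_p @ W)
    return F, W

This sketch always factorizes the k x k intermediate matrix; per the abstract, the paper chooses between formulations of different sizes depending on whether p > k.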
[1] Allan Pinkus et al., "Multilayer Feedforward Networks with a Non-Polynomial Activation Function Can Approximate Any Function," Neural Networks, 1991.

[2] C. L. Philip Chen et al., "Broad Learning System: An Effective and Efficient Incremental Learning System Without the Need for Deep Architecture," IEEE Transactions on Neural Networks and Learning Systems, 2018.

[3] C. L. Philip Chen et al., "Universal Approximation Capability of Broad Learning System and Its Structural Variations," IEEE Transactions on Neural Networks and Learning Systems, 2019.

[4] Isabelle Guyon et al., "Neural Network Recognizer for Hand-Written Zip Code Digits," NIPS, 1988.

[5] S. R. Searle et al., "On Deriving the Inverse of a Sum of Matrices," 1981.

[6] M. Ylinen et al., "A Fixed-Point Implementation of Matrix Inversion Using Cholesky Decomposition," 2003 46th Midwest Symposium on Circuits and Systems, 2003.

[7] Gene H. Golub et al., Matrix Computations, 3rd ed., 1996.

[8] Donald W. Marquardt, "Generalized Inverses, Ridge Regression, Biased Linear Estimation, and Nonlinear Estimation," 1970.

[9] G. Sohie et al., "Generalization of the Matrix Inversion Lemma," Proceedings of the IEEE, 1986.

[10] Bin Li et al., "An Improved Square-Root Algorithm for V-BLAST Based on Efficient Inverse Cholesky Factorization," IEEE Transactions on Wireless Communications, 2020.

[11] C. L. Philip Chen et al., "A Rapid Learning and Dynamic Stepwise Updating Algorithm for Flat Neural Networks and the Application to Time-Series Prediction," IEEE Transactions on Systems, Man, and Cybernetics, Part B, 1999.

[12] Hufei Zhu et al., "Reducing the Computational Complexity of Pseudoinverse for the Incremental Broad Learning System on Added Inputs," arXiv, 2019.

[13] Lawrence D. Jackel et al., "Handwritten Digit Recognition with a Back-Propagation Network," NIPS, 1989.

[14] Yoshua Bengio et al., "Gradient-Based Learning Applied to Document Recognition," Proceedings of the IEEE, 1998.

[15] Dejan J. Sobajic et al., "Learning and Generalization Characteristics of the Random Vector Functional-Link Net," Neurocomputing, 1994.