Deterministic convergence of complex mini-batch gradient learning algorithm for fully complex-valued neural networks

Abstract This paper investigates the fully complex mini-batch gradient algorithm for training complex-valued neural networks. The mini-batch gradient method is widely used in neural network training; however, its convergence analysis is usually restricted to real-valued neural networks and is probabilistic in nature. By introducing a new Taylor mean value theorem for analytic functions, we establish deterministic convergence results for the fully complex mini-batch gradient algorithm under mild conditions. Deterministic convergence here means that the algorithm converges with certainty rather than merely in probability, and both weak convergence and strong convergence are proved. Thanks to the newly introduced mean value theorem, our results are global in nature: they hold for arbitrarily given initial values of the weights. The theoretical findings are validated with a simulation example.

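Although the paper's contribution is theoretical, the algorithm it analyzes is easy to state. Below is a minimal sketch, assuming a single fully complex neuron with the analytic tanh activation and a squared-error loss: each mini-batch update moves the weights along the negative Wirtinger gradient of the loss taken with respect to the conjugated weights. Everything here (the synthetic data model, the step size eta, the batch size) is an illustrative assumption, not the paper's exact formulation.

```python
import numpy as np

# Minimal illustrative sketch (not the paper's exact algorithm): mini-batch
# gradient descent for a single fully complex neuron y = tanh(w^T x), where
# tanh is an analytic ("fully complex") activation. The descent direction is
# the Wirtinger derivative of the squared-error loss with respect to conj(w).
# All names and hyperparameters are illustrative assumptions.

rng = np.random.default_rng(0)

def f(z):                      # analytic activation
    return np.tanh(z)

def f_prime(z):                # its (also analytic) derivative
    return 1.0 - np.tanh(z) ** 2

n_in, n_samples, batch_size, eta, epochs = 4, 256, 32, 0.05, 200

# Synthetic complex-valued data generated by a "true" weight vector.
X = rng.standard_normal((n_samples, n_in)) + 1j * rng.standard_normal((n_samples, n_in))
w_true = 0.5 * (rng.standard_normal(n_in) + 1j * rng.standard_normal(n_in))
D = f(X @ w_true)

w = np.zeros(n_in, dtype=complex)        # arbitrary initial weights
for epoch in range(epochs):
    perm = rng.permutation(n_samples)    # reshuffle, then sweep mini-batches
    for start in range(0, n_samples, batch_size):
        idx = perm[start:start + batch_size]
        z = X[idx] @ w                   # pre-activations for the batch
        err = f(z) - D[idx]              # complex output errors
        # Batch-averaged Wirtinger gradient of E = (1/2)|err|^2 w.r.t. conj(w):
        grad = ((err * np.conj(f_prime(z)))[:, None] * np.conj(X[idx])).mean(axis=0)
        w = w - eta * grad               # steepest-descent update

print("final mean squared error:", np.mean(np.abs(f(X @ w) - D) ** 2))
```

Because tanh is analytic, the Wirtinger derivative of the error with respect to conj(w) vanishes for the output itself, which is why the gradient collapses to the simple product form in the inner loop.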