Stability analysis of opposite singularity in multilayer perceptrons

Abstract: For multilayer perceptrons (MLPs) with bipolar activation functions, opposite singularities exist in the parameter space. The Fisher information matrix degenerates on the opposite singularity, which causes strange learning behaviors. Because stability is fundamental to analyzing the properties of the opposite singularity, this paper addresses the stability analysis of the opposite singularity in MLPs. The analytical form of the best approximation on the opposite singularity is obtained first, from which a concrete expression of the Hessian matrix is derived. By analyzing the eigenvalues of the Hessian matrix on the opposite singularity, the stability of the opposite singularity is investigated. Finally, two experiments are carried out to verify the obtained results.
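
The degeneracy of the Fisher information matrix mentioned above can be illustrated numerically. The following is a minimal sketch, not the paper's own setup: it assumes a toy model f(x) = v1*tanh(w1*x) + v2*tanh(w2*x) with scalar input, takes tanh as the bipolar activation, takes the opposite singularity to be the region where w2 = -w1, and assumes Gaussian output noise so that the Fisher matrix is proportional to the average outer product of the output gradient. All function and variable names are hypothetical.

```python
import numpy as np

def output_gradient(x, w1, w2, v1, v2):
    """Gradient of f(x) = v1*tanh(w1*x) + v2*tanh(w2*x) w.r.t. (w1, w2, v1, v2)."""
    t1, t2 = np.tanh(w1 * x), np.tanh(w2 * x)
    # d tanh(w*x)/dw = x * (1 - tanh(w*x)**2)
    return np.stack([v1 * x * (1 - t1**2), v2 * x * (1 - t2**2), t1, t2], axis=1)

def fisher_matrix(xs, w1, w2, v1, v2):
    """Empirical Fisher matrix (up to the noise variance) for Gaussian-noise regression."""
    g = output_gradient(xs, w1, w2, v1, v2)
    return g.T @ g / len(xs)

def numerical_rank(matrix, rel_tol=1e-10):
    """Count eigenvalues above rel_tol times the largest eigenvalue."""
    eigenvalues = np.linalg.eigvalsh(matrix)
    return int(np.sum(eigenvalues > rel_tol * eigenvalues.max()))

rng = np.random.default_rng(0)
xs = rng.standard_normal(2000)  # assumed standard Gaussian inputs

# Generic parameter point versus a point with opposite hidden weights (w2 = -w1).
fisher_generic = fisher_matrix(xs, 0.7, 1.3, 0.5, -0.8)
fisher_opposite = fisher_matrix(xs, 0.7, -0.7, 0.5, -0.8)

rank_generic = numerical_rank(fisher_generic)
rank_opposite = numerical_rank(fisher_opposite)
print("rank at generic point: ", rank_generic)
print("rank at opposite point:", rank_opposite)
```

Because tanh is odd, the two hidden units produce outputs that differ only in sign when w2 = -w1, so the gradient components become linearly dependent and the matrix loses rank in this toy setting, whereas a generic point should give full rank.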
