Fast and Stable Learning Utilizing Singular Regions of Multilayer Perceptron

In the parameter space of MLP(J), a multilayer perceptron with J hidden units, there exist flat areas called singular regions, which are created by applying reducibility mappings to the optimal solution of MLP(J-1). Because such singular regions cause serious stagnation of learning, learning methods that avoid them have been sought. However, avoiding singular regions does not guarantee the quality of the final solution. This paper proposes a new learning method that does not avoid singular regions but makes good use of them to stably and successively find excellent solutions commensurate with MLP(J). The proposed method worked well in our experiments using artificial and real data sets.
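The abstract does not spell out the reducibility mappings, so the following is only a minimal sketch of two mappings commonly described in the literature on singular regions: adding a hidden unit whose output weight is zero, and duplicating a hidden unit while splitting its output weight. The tanh activation, the single output, the function names, and the numeric values are all assumptions made for illustration and are not taken from the paper. The sketch shows why the resulting regions are flat: the free parameters (`v_new` and `q`) can vary without changing the network output, and hence without changing the error.

```python
import math

def mlp_output(x, w0, w, V):
    """Single-output MLP: w0 + sum_j w[j] * tanh(V[j][0] + sum_k V[j][k+1] * x[k])."""
    total = w0
    for wj, vj in zip(w, V):
        pre = vj[0] + sum(vk * xk for vk, xk in zip(vj[1:], x))
        total += wj * math.tanh(pre)  # tanh is an assumed activation
    return total

def add_zero_weight_unit(w, V, v_new):
    """Map MLP(J-1) to MLP(J): new hidden unit with output weight 0; v_new is free."""
    return list(w) + [0.0], [list(v) for v in V] + [list(v_new)]

def split_unit(w, V, m, q):
    """Map MLP(J-1) to MLP(J): duplicate unit m, share its output weight q : (1 - q)."""
    w_new = list(w) + [(1.0 - q) * w[m]]
    w_new[m] = q * w[m]
    return w_new, [list(v) for v in V] + [list(V[m])]

# Hypothetical MLP(2) parameters with two inputs (illustrative values only).
w0, w = 0.3, [1.5, -0.7]
V = [[0.1, 0.4, -0.2], [-0.5, 0.9, 0.3]]
x = [0.8, -1.1]
base = mlp_output(x, w0, w, V)

# Any choice of v_new or q yields an MLP(3) with the same output as the MLP(2).
w_a, V_a = add_zero_weight_unit(w, V, [2.0, -3.0, 5.0])
w_b, V_b = split_unit(w, V, 0, 0.25)
print(abs(mlp_output(x, w0, w_a, V_a) - base) < 1e-12)
print(abs(mlp_output(x, w0, w_b, V_b) - base) < 1e-12)
```

Under these assumptions, a method that starts the search for MLP(J) from such points begins with an error no worse than that of the MLP(J-1) solution it was mapped from.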
