Paper Information - Improving the Convergence Property of Soft Committee Machines by Replacing Derivative with Truncated Gaussian Function


In online gradient descent learning, the local property of the derivative of the output function can cause slow convergence. This phenomenon, called a plateau, occurs in the learning process of multilayer networks. Focusing on the derivative term, we propose a simple method that replaces it with a truncated Gaussian function, which greatly increases the convergence speed. We then analyze a soft committee machine trained by the proposed method and show how the method breaks a plateau. The results show that the proposed method eventually breaks the symmetry between hidden units.
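The idea in the abstract can be illustrated with a minimal sketch of one online learning step for a soft committee machine in a teacher-student setting. Everything specific here is an assumption made for illustration, not the authors' formulation: the output function `erf(x / sqrt(2))`, the `eta / N` step scaling, and in particular the form and parameters (`sigma`, `cutoff`) of the hypothetical `truncated_gaussian` surrogate, since the abstract does not specify them.

```python
import math


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def g(x):
    """Hidden-unit output function (assumed: erf(x / sqrt(2)))."""
    return math.erf(x / math.sqrt(2.0))


def g_prime(x):
    """Exact derivative of g, used by plain online gradient descent."""
    return math.sqrt(2.0 / math.pi) * math.exp(-0.5 * x * x)


def truncated_gaussian(x, sigma=2.0, cutoff=4.0):
    """Hypothetical surrogate for g_prime: a Gaussian of width sigma
    that is set to zero for |x| > cutoff. The form and the default
    parameters are illustrative assumptions."""
    if abs(x) > cutoff:
        return 0.0
    return math.exp(-0.5 * (x / sigma) ** 2)


def scm_output(weights, xi):
    """Soft committee machine: sum of hidden-unit outputs, with the
    hidden-to-output weights fixed at 1."""
    return sum(g(dot(w, xi)) for w in weights)


def online_step(student, teacher, xi, eta, slope):
    """Update the student in place on a single example xi.

    `slope` is the function used in place of the derivative term:
    pass g_prime for standard gradient descent, or truncated_gaussian
    for the surrogate. Returns the output error before the update.
    """
    n = len(xi)
    error = scm_output(teacher, xi) - scm_output(student, xi)
    for w in student:
        delta = slope(dot(w, xi)) * error
        for j in range(n):
            w[j] += (eta / n) * delta * xi[j]
    return error
```

Keeping the derivative term as a swappable `slope` argument makes it easy to run the same training loop with `g_prime` and with `truncated_gaussian` and compare how long each stays on a plateau.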

Kazuyuki Hara | Kentaro Katahira
