Slow Dynamics Due to Singularities of Hierarchical Learning Machines

Recently, slow dynamics in learning of neural networks has been known to be closely related to singularities, which exist in parameter spaces of hierarchical learning models. To show the influence of singular structure on learning dynamics, we take statistical mechanical approaches and investigate online-learning dynamics under various learning scenario with different relationship between optimum and singularities. From the investigation, we found a quasi-plateau phenomenon which differs from the well known plateau. The quasi-plateau and plateau become extremely serious when an optimal point is in a neighborhood of a singularity. The quasi-plateau and plateau disappear in the natural gradient learning, which takes singular structures into account and uses Riemannian measure for the parameter space.