Accelerating Cyclic Update Algorithms for Parameter Estimation by Pattern Searches

A popular strategy for dealing with large parameter estimation problems is to split the problem into manageable subproblems and solve them cyclically, one by one, until convergence. A well-known drawback of this strategy is slow convergence in low-noise conditions. We propose using so-called pattern searches, which consist of an exploratory phase followed by a line search. During the exploratory phase, a search direction is determined by combining the individual updates of all subproblems. With modest algorithmic modifications, the approach can be used to speed up several well-known learning methods, such as variational Bayesian learning (ensemble learning) and the expectation-maximization (EM) algorithm. Experimental results show that the proposed method reduces the required convergence time by 60–85% in realistic variational Bayesian learning problems.
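The idea of an exploratory phase followed by a line search can be illustrated with a minimal sketch. It is not the authors' implementation: as a stand-in for a learning problem it assumes a toy quadratic cost f(x) = 0.5 x^T A x - b^T x with two strongly coupled parameters, treats each coordinate as one "subproblem", and takes the search direction to be the difference between the parameter values after and before one full cycle. The function names, the `period` schedule (a line search after every second cycle), and the exact quadratic line search are all assumptions made for illustration.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def residual(A, b, x):
    """Negative gradient of f(x) = 0.5 x^T A x - b^T x, i.e. b - A x."""
    return [bi - dot(row, x) for row, bi in zip(A, b)]

def sweep(A, b, x):
    """One cyclic pass: minimize f exactly over each coordinate in turn."""
    x = list(x)
    for i in range(len(x)):
        x[i] += (b[i] - dot(A[i], x)) / A[i][i]
    return x

def line_search(A, b, x, d):
    """Exact minimizer of f along x + step * d (valid for a quadratic f only)."""
    curvature = dot(d, [dot(row, d) for row in A])
    if curvature <= 0.0:  # degenerate direction: keep the current point
        return x
    step = dot(d, residual(A, b, x)) / curvature
    return [xi + step * di for xi, di in zip(x, d)]

def solve(A, b, x0, use_pattern, period=2, tol=1e-8, max_sweeps=100000):
    """Cyclic updates, optionally followed by a pattern-search line search
    after every `period`-th cycle (the schedule is an assumption)."""
    x = list(x0)
    for n in range(1, max_sweeps + 1):
        x_new = sweep(A, b, x)  # exploratory phase: update all subproblems
        if use_pattern and n % period == 0:
            # Combine the individual updates into a single search direction.
            direction = [new - old for new, old in zip(x_new, x)]
            x_new = line_search(A, b, x_new, direction)
        x = x_new
        if max(abs(r) for r in residual(A, b, x)) < tol:
            return x, n
    return x, max_sweeps

# Two strongly coupled parameters: plain cyclic updates zigzag slowly.
A = [[1.0, 0.99], [0.99, 1.0]]
b = [1.0, 1.0]
x_plain, n_plain = solve(A, b, [0.0, 0.0], use_pattern=False)
x_pattern, n_pattern = solve(A, b, [0.0, 0.0], use_pattern=True)
print("cycles without pattern search:", n_plain)
print("cycles with pattern search:   ", n_pattern)
```

In this toy problem the cyclic iterates all lie on one line through the optimum, so a single line search along the combined update direction removes most of the zigzagging. Real variational Bayesian or EM cost functions are not quadratic, so the line search there would have to be approximate and the gain would generally be smaller than in this sketch.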
