Simulated Stochastic Approximation Annealing for Global Optimization With a Square-Root Cooling Schedule

Simulated annealing is widely used to solve optimization problems. It is well known, however, that simulated annealing is guaranteed to locate the global optima only when a logarithmic cooling schedule is used, and that schedule is so slow that its CPU cost is unaffordable in practice. This article proposes a new stochastic optimization algorithm, the simulated stochastic approximation annealing algorithm, which combines simulated annealing with the stochastic approximation Monte Carlo algorithm. Under the framework of stochastic approximation, it is shown that the new algorithm can work with a cooling schedule in which the temperature decreases much faster than in the logarithmic cooling schedule, for example a square-root cooling schedule, while still guaranteeing that the global optima are reached as the temperature tends to zero. The new algorithm has been tested on a few benchmark optimization problems, including feed-forward neural network training and protein folding. The numerical results indicate that the new algorithm can significantly outperform simulated annealing and other competitors. Supplementary materials for this article are available online.
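
To give a sense of the gap between the two cooling schedules, the following minimal sketch compares how many iterations each needs to bring the temperature down to a given target. The functional forms `c / log(t + 1)` and `tau0 / sqrt(t)`, the constants `c` and `tau0`, and all function names are illustrative assumptions, not the exact schedules or notation of the article, and the sketch does not implement the proposed algorithm itself.

```python
import math


def log_cooling(t, c=1.0):
    """Assumed logarithmic schedule: temperature c / log(t + 1) at iteration t >= 1."""
    return c / math.log(t + 1.0)


def sqrt_cooling(t, tau0=1.0):
    """Assumed square-root schedule: temperature tau0 / sqrt(t) at iteration t >= 1."""
    return tau0 / math.sqrt(t)


def log_iters_needed(target, c=1.0):
    """Smallest t with c / log(t + 1) <= target, i.e. t >= exp(c / target) - 1."""
    return math.ceil(math.exp(c / target) - 1.0)


def sqrt_iters_needed(target, tau0=1.0):
    """Smallest t with tau0 / sqrt(t) <= target, i.e. t >= (tau0 / target) ** 2."""
    return math.ceil((tau0 / target) ** 2)


if __name__ == "__main__":
    # Hypothetical target temperatures, chosen only for illustration.
    for target in (0.2, 0.1, 0.05):
        print(
            f"target={target}: "
            f"log schedule needs {log_iters_needed(target)} iterations, "
            f"square-root schedule needs {sqrt_iters_needed(target)}"
        )
```

Under these assumed forms, the iteration count for the logarithmic schedule grows exponentially in `1 / target`, while the count for the square-root schedule grows only quadratically, which is the practical motivation for seeking convergence guarantees under the faster schedule.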
