Discrete optimization via gradient-based adaptive stochastic search methods

ABSTRACT Gradient-based Adaptive Stochastic Search (GASS) is a recently proposed stochastic search optimization algorithm. It iteratively searches for promising candidate solutions using a population of samples generated from a parameterized probabilistic model on the solution space, and updates the parameter of that model with a direct gradient method. Within the GASS framework, we propose two discrete optimization algorithms: discrete Gradient-based Adaptive Stochastic Search (discrete-GASS) and annealing Gradient-based Adaptive Stochastic Search (annealing-GASS). In discrete-GASS, we transform the discrete optimization problem into a continuous optimization problem on the parameter space of a family of independent discrete distributions, and apply a gradient-based method to find the optimal parameter, so that the corresponding distribution is the one best able to generate optimal solution(s) to the original discrete problem. In annealing-GASS, we use a Boltzmann distribution as the parameterized probabilistic model, and propose a gradient-based temperature schedule that adapts to the current performance of the algorithm. We show convergence of both discrete-GASS and annealing-GASS under appropriate conditions. Numerical results on several benchmark optimization problems and the traveling salesman problem indicate that both algorithms perform competitively against a number of other algorithms, including model reference adaptive search, the cross-entropy method, and multi-start simulated annealing with different temperature schedules.
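The idea behind discrete-GASS, sampling from independent discrete distributions and moving their parameters along a gradient estimate, can be illustrated with a small sketch. This is not the authors' algorithm: it is a simplified score-function gradient update on independent Bernoulli parameters, and the objective `onemax`, the function name `score_gradient_search`, the standardized weights, the step size, and the clipping bounds are all assumptions chosen for illustration.

```python
import numpy as np


def onemax(x):
    """Hypothetical toy objective: number of ones in each binary row."""
    return x.sum(axis=1)


def score_gradient_search(objective, dim, iters=200, pop=100, step=0.1, seed=0):
    """Maximize `objective` over binary vectors by adapting Bernoulli parameters.

    theta[i] is the probability that coordinate i equals 1. Each iteration
    samples a population, weights the samples by standardized objective
    values, and moves theta along a score-function gradient estimate.
    """
    rng = np.random.default_rng(seed)
    theta = np.full(dim, 0.5)  # start from the uniform distribution
    for _ in range(iters):
        # Sample a population from the product-of-Bernoullis model.
        x = (rng.random((pop, dim)) < theta).astype(float)
        f = objective(x)

        # Standardize objective values (an assumed weighting scheme).
        w = f - f.mean()
        spread = f.std()
        if spread > 0:
            w = w / spread

        # The Bernoulli score is (x - theta) / (theta * (1 - theta)).
        # Multiplying by theta * (1 - theta) as a preconditioner leaves
        # the simple direction mean(w * (x - theta)).
        direction = (w[:, None] * (x - theta)).mean(axis=0)

        # Gradient step, then clip so every outcome keeps positive probability.
        theta = np.clip(theta + step * direction, 0.01, 0.99)
    return theta


theta = score_gradient_search(onemax, dim=10)
print(np.round(theta, 2))
```

On this toy problem the parameters should drift toward the upper clipping bound, so the model concentrates on the all-ones vector. The weighting of samples, the step-size rule, and the parameter update in the paper may differ from what is sketched here.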
