Theory and Applications of Hybrid Simulated Annealing

Local optimization techniques such as gradient-based methods and the expectation-maximization algorithm have the advantage of fast convergence but do not guarantee convergence to the global optimum. Global optimization techniques based on stochastic approaches, such as evolutionary algorithms and simulated annealing, offer the possibility of global convergence, but at the cost of greater computation time. This chapter demonstrates how the two approaches can be combined to improve both convergence speed and solution quality. In particular, a hybrid method called hybrid simulated annealing (HSA) is presented, in which a simulated annealing algorithm is combined with local optimization methods. First, its general procedure and mathematical convergence properties are described. Then two example applications are presented, namely optimization of hidden Markov models for visual speech recognition and optimization of radial basis function networks for pattern classification, to show how the HSA algorithm can be applied effectively to real-world problems. An appendix provides source code for multi-dimensional Cauchy random number generation, which is essential for implementing the presented method.
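
To make the procedure concrete, the following Python sketch combines a fast-annealing loop (Cauchy neighbor generation with a T0/k cooling schedule) with a user-supplied local optimizer refining each candidate before the Metropolis test. It is a minimal illustration under assumptions, not the chapter's exact formulation: the local_opt callback, the cooling schedule, and the acceptance rule here are generic placeholders. The Cauchy step uses the standard fact that a multivariate Cauchy variate is a multivariate t variate with one degree of freedom, rather than the appendix's own generator.

    import math
    import numpy as np

    def cauchy_neighbor(x, T, rng):
        """Return x plus an n-dimensional Cauchy step of scale T.

        A multivariate Cauchy variate is a multivariate t variate with
        one degree of freedom: d = T * z / |u|, where z ~ N(0, I_n)
        and u ~ N(0, 1).
        """
        z = rng.standard_normal(len(x))
        u = rng.standard_normal()
        return x + T * z / abs(u)

    def hybrid_simulated_annealing(f, x0, local_opt, T0=1.0, n_iter=1000, seed=0):
        """Minimize f: each global Cauchy move is refined by local_opt
        (e.g., a few gradient or EM steps) before the Metropolis test."""
        rng = np.random.default_rng(seed)
        x = np.asarray(x0, dtype=float)
        fx = f(x)
        best, fbest = x, fx
        for k in range(1, n_iter + 1):
            T = T0 / k                                    # fast-annealing cooling schedule
            cand = local_opt(cauchy_neighbor(x, T, rng))  # global move, then local refinement
            fc = f(cand)
            # Metropolis rule: accept improvements outright, and uphill
            # moves with probability exp(-(fc - fx) / T).
            if fc < fx or rng.random() < math.exp(-(fc - fx) / T):
                x, fx = cand, fc
                if fx < fbest:
                    best, fbest = x, fx
        return best, fbest

With local_opt = lambda x: x this reduces to plain fast simulated annealing; substituting a few iterations of a local method, such as gradient descent for radial basis function networks or Baum-Welch re-estimation for hidden Markov models, gives the hybrid behavior the chapter analyzes.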
