Rollout Algorithms for Combinatorial Optimization

We consider the approximate solution of discrete optimization problems using procedures that are capable of magnifying the effectiveness of any given heuristic algorithm through sequential application. In particular, we embed the problem within a dynamic programming framework, and we introduce several types of rollout algorithms, which are related to notions of policy iteration. We provide conditions guaranteeing that the rollout algorithm improves the performance of the original heuristic algorithm. The method is illustrated in the context of a machine maintenance and repair problem.
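To make the idea of sequential application concrete, the following is a minimal sketch of a one-step rollout built on top of a base heuristic. It is not the paper's machine maintenance and repair example: as an illustrative assumption it uses a small symmetric traveling salesman instance with a nearest-neighbor base heuristic, and the names `tour_cost`, `greedy_complete`, `rollout`, and `DIST` are hypothetical. At each step the rollout tries every feasible next city, completes the tour with the base heuristic, and commits to the choice whose completed tour is cheapest. Because this base heuristic depends only on the current partial tour and breaks ties deterministically, the rollout tour should cost no more than the tour produced by the heuristic alone.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[int]]


def tour_cost(dist: Matrix, tour: Sequence[int]) -> int:
    """Cost of a closed tour, including the edge back to the first city."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))


def greedy_complete(dist: Matrix, partial: Sequence[int]) -> List[int]:
    """Base heuristic (assumed): extend a partial tour by nearest neighbor.

    Ties are broken by city index so the heuristic is deterministic.
    """
    tour = list(partial)
    unvisited = [c for c in range(len(dist)) if c not in tour]
    while unvisited:
        last = tour[-1]
        nxt = min(unvisited, key=lambda c: (dist[last][c], c))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


def rollout(dist: Matrix, start: int = 0) -> List[int]:
    """One-step rollout: score each candidate next city by the cost of the
    tour that the base heuristic produces after committing to it."""
    n = len(dist)
    tour = [start]
    while len(tour) < n:
        candidates = [c for c in range(n) if c not in tour]
        best = min(
            candidates,
            key=lambda c: (tour_cost(dist, greedy_complete(dist, tour + [c])), c),
        )
        tour.append(best)
    return tour


# Hypothetical 5-city symmetric distance matrix, for illustration only.
DIST = [
    [0, 2, 9, 10, 7],
    [2, 0, 6, 4, 3],
    [9, 6, 0, 8, 5],
    [10, 4, 8, 0, 6],
    [7, 3, 5, 6, 0],
]

if __name__ == "__main__":
    base_tour = greedy_complete(DIST, [0])
    rollout_tour = rollout(DIST, start=0)
    print("base heuristic:", base_tour, tour_cost(DIST, base_tour))
    print("rollout:       ", rollout_tour, tour_cost(DIST, rollout_tour))
```

The rollout calls the base heuristic once per candidate at every step, so its running time is roughly the heuristic's running time multiplied by the number of decisions and by the number of candidates per decision. That extra computation is the price paid for the improved solution quality.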
