Use of random admissible values for control in iterative dynamic programming

Iterative dynamic programming with region contraction is examined for optimal control when the admissible control values are generated randomly rather than chosen uniformly. Random generation becomes necessary when the number of control variables is very large, because it keeps the number of trajectories to be evaluated and compared reasonably small. Two examples (a series of chemical reactors and a mathematical system) are used to examine the viability of this method of choosing candidate control values. In the numerical example with 20 state variables and 20 control variables, convergence to the optimum was fast even when only 100 randomly chosen control values were used at each grid point.
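
To make the idea of randomly generated candidates concrete, the following is a minimal sketch of the candidate-generation and region-contraction steps only; it is not the full iterative dynamic programming procedure and is not taken from the paper. It assumes each candidate is the current best control plus a uniform random fraction (in [-1, 1]) of the current region size, clipped to the admissible bounds. The function names, the contraction factor `gamma`, and the quadratic `toy_cost` (a stand-in for integrating the state equations and evaluating the performance index) are all hypothetical. For scale: with 20 control variables, even 3 uniformly spaced values per variable would give 3^20 (about 3.5 billion) combinations, whereas the random scheme here evaluates a fixed 100 candidates per pass.

```python
import random


def random_candidates(u_center, region, lower, upper, n_candidates, rng):
    """Draw n_candidates control vectors around u_center.

    Assumption: component j is u_center[j] + d * region[j] with d uniform
    in [-1, 1], then clipped to the admissible range [lower[j], upper[j]].
    """
    candidates = []
    for _ in range(n_candidates):
        u = []
        for j in range(len(u_center)):
            d = rng.uniform(-1.0, 1.0)
            value = u_center[j] + d * region[j]
            u.append(min(max(value, lower[j]), upper[j]))
        candidates.append(u)
    return candidates


def contract(region, gamma):
    """Shrink every region size by the (hypothetical) factor gamma."""
    return [gamma * r for r in region]


def toy_cost(u):
    """Hypothetical stand-in for the performance index of one trajectory."""
    return sum((x - 0.3) ** 2 for x in u)


def demo(m=20, n_candidates=100, passes=30, gamma=0.85, seed=0):
    """Repeat: sample candidates, keep the best, contract the region."""
    rng = random.Random(seed)
    lower, upper = [0.0] * m, [1.0] * m
    u_best = [0.5] * m
    region = [0.5] * m
    best_cost = toy_cost(u_best)
    for _ in range(passes):
        for u in random_candidates(u_best, region, lower, upper,
                                   n_candidates, rng):
            cost = toy_cost(u)
            if cost < best_cost:
                best_cost, u_best = cost, u
        region = contract(region, gamma)
    return u_best, best_cost


if __name__ == "__main__":
    u_best, best_cost = demo()
    print("best cost after contraction passes:", best_cost)
```

In a full implementation, the cost evaluation would be replaced by integration of the state equations from each grid point at each time stage, and the sampling would be repeated per stage and per grid point. The sketch only shows that the number of candidates evaluated per pass stays fixed regardless of the number of control variables.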