Applications to Learning, State Dependent Noise, and Queueing

This chapter deals with more specific classes of examples, which are of increasing importance in current applications in many areas of technology. They are described in somewhat more detail than the examples of Chapter 1, and the illustrations given for each class are typical of those in a rapidly growing literature. Section 1 deals with a problem in learning theory: the learning of an optimal hunting strategy by an animal, based on the history of successes and failures in repeated attempts to feed itself efficiently. Section 2 concerns the “learning” or “training” of a neural network. In the training phase, a random series of inputs is presented to the network, and there is a desired response to each input. The problem is to adjust the weights in the network to minimize the average distance between the actual and desired responses. This is done by a training procedure in which the weights are adjusted after each (input, output) pair is observed. Loosely speaking, the increments in the weights are proportional to stochastic estimates of the derivative of the error with respect to the weights. Section 3 deals with an optimization problem for a controlled Markov chain model, where the transition probabilities are not known and one wishes to learn the optimal strategy in the course of the system’s operation.
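The training procedure outlined for Section 2 can be illustrated by a minimal sketch. It is not the algorithm developed in that section: it assumes a single linear unit, a squared-error criterion, Gaussian inputs, and a decreasing step size of the form 1/(n + 10). The function name `train_linear_unit` and the `target` weights are hypothetical, chosen only for illustration. After each (input, desired response) pair is observed, the weights move by a step proportional to a noisy estimate of the negative derivative of the error with respect to the weights.

```python
import random


def train_linear_unit(num_steps=20000, seed=0):
    """Illustrative stochastic-gradient training of one linear unit."""
    rng = random.Random(seed)
    # Hypothetical "true" weights that generate the desired responses.
    target = [2.0, -1.0]
    # Initial weights of the unit being trained.
    weights = [0.0, 0.0]

    for n in range(1, num_steps + 1):
        # A random input is presented to the unit.
        x = [rng.gauss(0.0, 1.0) for _ in target]
        # The desired response is observed with additive noise (assumed).
        desired = sum(t * xi for t, xi in zip(target, x)) + rng.gauss(0.0, 0.1)
        # The actual response under the current weights.
        actual = sum(w * xi for w, xi in zip(weights, x))
        error = desired - actual

        # The derivative of 0.5 * error**2 with respect to the weights is
        # -error * x, so the update moves the weights along +error * x.
        step = 1.0 / (n + 10)  # decreasing step size (an assumed choice)
        weights = [w + step * error * xi for w, xi in zip(weights, x)]

    return weights, target


if __name__ == "__main__":
    learned, true_weights = train_linear_unit()
    print("learned:", learned)
    print("target: ", true_weights)
```

Each update uses only the single pair just observed, so the individual increments are noisy; it is the decreasing step size that averages this noise out over many iterations.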