Dynamic programming and adaptive processes: Mathematical foundation

In many engineering, economic, biological, and statistical control processes, a decision-making device is called upon to perform under various conditions of uncertainty regarding underlying physical processes. These conditions range from complete knowledge to total ignorance. As the process unfolds, additional information may become available to the controlling element, which then has the possibility of “learning” to improve its performance based upon experience; i.e., the controlling element may adapt itself to its environment. On a grand scale, situations of this type occur in the development of physical theories through the mutual interplay of experimentation and theory; on a smaller scale they occur in connection with the design of learning servomechanisms and adaptive filters. The central purpose of this paper is to lay a foundation for the mathematical treatment of broad classes of such adaptive processes. This is accomplished through use of the concepts of dynamic programming. Subsequent papers will be devoted to specific applications in different fields and various theoretical extensions.
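To make the notion of a controlling element that “learns” concrete, the following is a minimal sketch, not taken from the paper itself, of one of the simplest adaptive processes treatable by dynamic programming: a decision maker repeatedly chooses between an option of known success probability and one of unknown success probability, updating a Bayesian posterior on the unknown option as outcomes are observed. The state of the dynamic program is the information state (observed successes and failures); the constants `KNOWN_P` and `HORIZON` and all function names are illustrative assumptions.

```python
from functools import lru_cache

KNOWN_P = 0.6   # assumed success probability of the known option
HORIZON = 10    # assumed number of remaining decisions

@lru_cache(maxsize=None)
def value(s, f, n):
    """Maximum expected number of successes over n remaining decisions,
    given s observed successes and f observed failures on the unknown
    option (uniform Beta(1, 1) prior on its success probability)."""
    if n == 0:
        return 0.0
    # Posterior mean of the unknown option under the Beta(1+s, 1+f) posterior.
    p = (s + 1) / (s + f + 2)
    # Trying the unknown option yields information as well as reward:
    # the outcome updates the information state (s, f).
    try_unknown = p * (1 + value(s + 1, f, n - 1)) + (1 - p) * value(s, f + 1, n - 1)
    # Trying the known option yields reward only; the information state is unchanged.
    try_known = KNOWN_P + value(s, f, n - 1)
    return max(try_unknown, try_known)

# The optimal adaptive policy can never do worse than always choosing
# the known option, whose expected yield is HORIZON * KNOWN_P.
print(value(0, 0, HORIZON))
```

The functional-equation structure, in which the value of a state is the maximum over decisions of immediate reward plus the value of the resulting state, is the dynamic-programming device the paper builds on; here the “state” includes what has been learned so far, which is what makes the process adaptive.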