Decision making mechanisms and stopping times in a generalized information system

Abstract. A mechanism is studied for a decision maker who is ethically bound to pursue the best available course of action. The model allows feedback information to be input at each cycle of decision making. This type of information-feedback loop is relevant to many real decision-making situations. Equations describing the updated expected utility in terms of the learning factor are given, followed by a discussion of when learning takes place. It is then shown that the resulting estimates of expected utility converge to the appropriate values. Moreover, these estimators form a martingale or a submartingale, which allows an analysis of when and how the decision makers can terminate the feedback decision-making loop. Several stopping strategies are discussed, along with the corresponding probabilities of ending the loop in a finite amount of time.