Index Policies for Stochastic Search in a Forest with an Application to R&D Project Management

This paper concerns a stochastic search problem in a forest. As motivation, consider the issue of investing in a research-and-development project. Each activity that could be undertaken in the project is represented as an edge in a forest. Each edge has a cost of being attempted and a probability of success. An edge can be attempted if every one of its predecessors has been attempted and has succeeded. The overall project succeeds if some path from a stem to a leaf of the forest consists entirely of successful edges. Overall success yields an economic benefit. The problem is to find an investment strategy that maximizes expected utility, with either a linear or an exponential utility function. This problem is shown to have a simple solution. Each edge is assigned an index such that expected utility is maximized by attempting, at each opportunity, an available edge whose index is largest, and by terminating the search when no edge with a positive index remains. These indices are nested in a way that makes them quick to compute.
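The attempt rule described above can be illustrated with a short simulation. This is only a sketch under stated assumptions: the abstract does not give the formula for the indices, so each edge's `index` is treated here as a number supplied by the caller, and only the linear-utility case (benefit minus total cost) is simulated. The names `Edge`, `run_index_policy`, and `benefit` are hypothetical and chosen for illustration.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Edge:
    name: str
    parent: Optional[str]  # predecessor edge, or None for a stem edge
    cost: float            # cost of attempting this edge
    p_success: float       # probability that an attempt succeeds
    index: float           # ASSUMPTION: supplied by the caller, not computed here
    is_leaf: bool = False  # True if success on this edge completes a path


def run_index_policy(edges: List[Edge], benefit: float,
                     rng: random.Random) -> Tuple[float, List[str]]:
    """Simulate one run of the rule; return (net payoff, attempted edge names)."""
    attempted, succeeded = set(), set()
    order: List[str] = []
    payoff = 0.0
    while True:
        # An edge is available if it is unattempted and its predecessor
        # succeeded (which, in a forest, implies all ancestors succeeded).
        candidates = [
            e for e in edges
            if e.name not in attempted
            and (e.parent is None or e.parent in succeeded)
            and e.index > 0
        ]
        if not candidates:
            return payoff, order  # no positive index remains: stop searching
        edge = max(candidates, key=lambda e: e.index)
        attempted.add(edge.name)
        order.append(edge.name)
        payoff -= edge.cost
        if rng.random() < edge.p_success:
            succeeded.add(edge.name)
            if edge.is_leaf:
                return payoff + benefit, order  # a stem-to-leaf path succeeded
```

Averaging the returned payoff over many runs with a seeded `random.Random` gives a Monte Carlo estimate of expected linear utility for whatever index values are supplied; the exponential-utility case would need a different payoff aggregation and is not covered by this sketch.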
