Denumerable-Armed Bandits

This paper studies the class of denumerable-armed (i.e., finite- or countably infinite-armed) bandit problems with independent arms and geometric discounting over an infinite horizon, in which each arm generates rewards according to one of a finite number of distributions. The authors derive certain continuity and curvature properties of the Gittins index and provide necessary and sufficient conditions under which this index characterizes the optimal strategies. They then show that, at each point in time, the arm selected by an optimal strategy will, with positive probability, remain an optimal selection forever. Copyright 1992 by The Econometric Society.