Statistics of the Kolkata Paise Restaurant problem

We study the dynamics of a few stochastic learning strategies for the 'Kolkata Paise Restaurant' problem, where N agents choose among N equally priced but differently ranked restaurants every evening, such that each agent tries to get dinner in the best restaurant (with each restaurant serving only one customer and the rest of the customers arriving there going without dinner that evening). We consider the learning strategies to be similar for all the agents, and assume that each follows the same probabilistic or stochastic strategy dependent on information about past successes in the game. We show that some 'naive' strategies lead to much better utilization of services than some relatively 'smarter' strategies. We also show that a service utilization fraction as high as 0.80 can result for a stochastic strategy, where each agent sticks to his past choice (independent of success achieved or not, with probability decreasing inversely in the past crowd size). The numerical results for the utilization fraction of the services in some limiting cases are analytically examined.