Paper Information - Multi-Armed Bandits with Betting

Multi-Armed Bandits with Betting

In this paper we consider an extension of the multi-armed bandit problem in which the gambler has, at each round, K coins available for play, and the slot machines accept bets. If the player bets m coins on a machine, the machine returns m times the payoff of that round. It is important to note that betting m coins on a machine yields a single sample from that machine's reward distribution (multiplied by m), not m independent samples. At each round, the gambler must divide all K coins among the machines so as to maximize the total expected payoff.
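The mechanics of a single round can be illustrated with a minimal sketch. It is not taken from the paper: the function name `play_round`, the use of Bernoulli payoffs, and the `means` parameter are assumptions made only for illustration. The point it shows is that each machine is sampled once per round and that single payoff is scaled by the bet.

```python
import random


def play_round(bets, means, K, rng):
    """Simulate one betting round (hypothetical helper, not from the paper).

    bets[i] is the number of coins placed on machine i. `means` holds
    Bernoulli success probabilities, an assumption made for this sketch.
    """
    if any(m < 0 for m in bets) or sum(bets) != K:
        raise ValueError("bets must be non-negative and sum to K")
    total = 0.0
    for m, p in zip(bets, means):
        # One sample per machine per round, regardless of the bet size.
        payoff = 1.0 if rng.random() < p else 0.0
        # The single sample is multiplied by the bet; it is not m fresh draws.
        total += m * payoff
    return total


rng = random.Random(0)
# Machine 0 always pays 1 and machine 1 never pays, so the return is 3 * 1 + 2 * 0.
round_return = play_round([3, 2], [1.0, 0.0], K=5, rng=rng)
```

Because the payoff is one draw scaled by m, placing all of a machine's coins at once carries more variance than m independent plays would, which is what separates this setting from the multiple-plays variant.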

Alexandru Niculescu-Mizil

[1] J. Walrand, et al. Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-Part II: Markovian rewards, 1987.

[2] Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem, 2002, SIAM J. Comput.

[3] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.

[4] Robert D. Kleinberg, et al. Competitive collaborative learning, 2005, Journal of Computer and System Sciences.

[5] T. L. Lai and Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules, 1985.