Decentralized multi-armed bandit with multiple distributed players
暂无分享,去创建一个
[1] Pierre Pepin,et al. The robustness of lognormal-based estimators of abundance , 1990 .
[2] R. Agrawal. Sample mean based index policies by O(log n) regret for the multi-armed bandit problem , 1995, Advances in Applied Probability.
[3] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[4] H. Robbins,et al. Asymptotically efficient adaptive allocation rules , 1985 .
[5] J. Walrand,et al. Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-Part II: Markovian rewards , 1987 .
[6] Yi Gai,et al. Learning Multiuser Channel Allocations in Cognitive Radio Networks: A Combinatorial Multi-Armed Bandit Formulation , 2010, 2010 IEEE Symposium on New Frontiers in Dynamic Spectrum (DySPAN).
[7] Michael A. Bean,et al. Probability: The Science of Uncertainty with Applications to Investments, Insurance, and Engineering , 2000 .
[8] W. E. Sharp,et al. A log-normal distribution of alluvial diamonds with an economic cutoff , 1976 .
[9] Hai Jiang,et al. Medium access in cognitive radio networks: A competitive multi-armed bandit framework , 2008, 2008 42nd Asilomar Conference on Signals, Systems and Computers.
[10] Qing Zhao,et al. Distributed Learning in Multi-Armed Bandit With Multiple Players , 2009, IEEE Transactions on Signal Processing.
[11] Brian M. Sadler,et al. A Survey of Dynamic Spectrum Access , 2007, IEEE Signal Processing Magazine.
[12] Ao Tang,et al. Opportunistic Spectrum Access with Multiple Users: Learning under Competition , 2010, 2010 Proceedings IEEE INFOCOM.