Fast Online Learning of Antijamming and Jamming Strategies
[1] W. R. Thompson. On the Likelihood that One Unknown Probability Exceeds Another in View of the Evidence of Two Samples, 1933.
[2] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[3] K. J. Ray Liu, et al. An anti-jamming stochastic game for cognitive radio networks, 2011, IEEE Journal on Selected Areas in Communications.
[4] J. Gittins. Bandit processes and dynamic allocation indices, 1979.
[5] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[6] Michael P. Wellman, et al. Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm, 1998, ICML.
[7] L. Shapley, et al. Stochastic Games, 1953, Proceedings of the National Academy of Sciences.
[8] P. Whittle. Restless bandits: activity allocation in a changing world, 1988, Journal of Applied Probability.
[9] Michael L. Littman, et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning, 1994, ICML.
[10] L. Haan, et al. Extreme value theory: an introduction, 2006.
[11] H. T. Kung, et al. Competing Mobile Network Game: Embracing antijamming and jamming strategies with reinforcement learning, 2013, 2013 IEEE Conference on Communications and Network Security (CNS).
[12] H. Robbins, et al. Asymptotically efficient adaptive allocation rules, 1985.
[13] Richard S. Sutton, et al. Reinforcement Learning, 1992, Handbook of Machine Learning.
[14] Michael L. Littman, et al. Friend-or-Foe Q-learning in General-Sum Games, 2001, ICML.
[15] R. Bellman. Dynamic programming, 1957, Science.
[16] Ronald L. Rivest, et al. Simulation results for a new two-armed bandit heuristic, 1994, Annual Conference Computational Learning Theory.
[17] J. Walrand, et al. Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-Part II: Markovian rewards, 1987.
[18] R. Bellman. A Problem in the Sequential Design of Experiments, 1954.
[19] Adam Tauman Kalai, et al. Online convex optimization in the bandit setting: gradient descent without a gradient, 2004, SODA '05.
[20] Martin Zinkevich, et al. Online Convex Programming and Generalized Infinitesimal Gradient Ascent, 2003, ICML.
[21] Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem, 2002, SIAM J. Comput.
[22] H. T. Kung, et al. Optimizing media access strategy for competing cognitive radio networks, 2013, 2013 IEEE Global Communications Conference (GLOBECOM).