Online Learning Schemes for Power Allocation in Energy Harvesting Communications
[1] John Langford,et al. The Epoch-Greedy Algorithm for Multi-armed Bandits with Side Information , 2007, NIPS.
[2] Bhaskar Krishnamachari,et al. Online learning of power allocation policies in energy harvesting communications , 2016, 2016 International Conference on Signal Processing and Communications (SPCOM).
[3] Biplab Sikdar,et al. Energy efficient transmission strategies for Body Sensor Networks with energy harvesting , 2008, 2008 42nd Annual Conference on Information Sciences and Systems.
[4] Apostolos Burnetas,et al. Optimal Adaptive Policies for Markov Decision Processes , 1997, Math. Oper. Res..
[5] Xiaodong Wang,et al. Communication of Energy Harvesting Tags , 2012, IEEE Transactions on Communications.
[6] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[7] Stephen P. Boyd,et al. CVXPY: A Python-Embedded Modeling Language for Convex Optimization , 2016, J. Mach. Learn. Res..
[8] John Langford,et al. Efficient Optimal Learning for Contextual Bandits , 2011, UAI.
[9] Rui Zhang,et al. Optimal Energy Allocation for Wireless Communications With Energy Harvesting Constraints , 2011, IEEE Transactions on Signal Processing.
[10] Kaibin Huang,et al. Energy Harvesting Wireless Communications: A Review of Recent Advances , 2015, IEEE Journal on Selected Areas in Communications.
[11] W. Hoeffding. Probability Inequalities for Sums of Bounded Random Variables , 1963 .
[12] V. Climenhaga. Markov chains and mixing times , 2013 .
[13] Jing Yang,et al. Transmission with Energy Harvesting Nodes in Fading Wireless Channels: Optimal Policies , 2011, IEEE Journal on Selected Areas in Communications.
[14] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[15] Roy D. Yates,et al. A generic model for optimizing single-hop transmission policy of replenishable sensors , 2009, IEEE Transactions on Wireless Communications.
[16] Peter Auer,et al. Logarithmic Online Regret Bounds for Undiscounted Reinforcement Learning , 2006, NIPS.
[17] Neelesh B. Mehta,et al. Transmit Power Control Policies for Energy Harvesting Sensors With Retransmissions , 2013, IEEE Journal of Selected Topics in Signal Processing.
[18] Jing Yang,et al. Optimal Packet Scheduling in an Energy Harvesting Communication System , 2010, IEEE Transactions on Communications.
[19] T. L. Lai and Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules , 1985 .
[20] John Langford,et al. Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits , 2014, ICML.
[21] Prasanna Chaporkar,et al. Optimal power allocation for a renewable energy source , 2011, 2012 National Conference on Communications (NCC).
[22] B. Krishnamachari,et al. Efficient Scheduling for Energy-Delay Tradeoff on a Time-Slotted Channel , 2015 .
[23] Wei Chu,et al. A contextual-bandit approach to personalized news article recommendation , 2010, WWW '10.
[24] Elizabeth L. Wilmer,et al. Markov Chains and Mixing Times , 2008 .
[25] Aylin Yener,et al. Optimum Transmission Policies for Battery Limited Energy Harvesting Nodes , 2010, IEEE Transactions on Wireless Communications.
[26] Sattar Vakili,et al. Deterministic Sequencing of Exploration and Exploitation for Multi-Armed Bandit Problems , 2011, IEEE Journal of Selected Topics in Signal Processing.
[27] Peter Auer,et al. Near-optimal Regret Bounds for Reinforcement Learning , 2008, J. Mach. Learn. Res..
[28] Bhaskar Krishnamachari,et al. Stochastic Contextual Bandits with Known Reward Functions , 2016, ArXiv.
[29] Ambuj Tewari,et al. Optimistic Linear Programming gives Logarithmic Regret for Irreducible MDPs , 2007, NIPS.