Online Learning in Weakly Coupled Markov Decision Processes: A Convergence Time Study
[1] Yishay Mansour, et al. Online Markov Decision Processes, 2009, Math. Oper. Res.
[2] Hao Yu, et al. A Simple Parallel Algorithm with an O(1/t) Convergence Rate for General Convex Programs, 2015, SIAM J. Optim.
[3] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[4] Craig Boutilier, et al. Budget Allocation using Weakly Coupled, Constrained Markov Decision Processes, 2016, UAI.
[5] B. Fox. Markov Renewal Programming by Linear Fractional Programming, 1966.
[6] Mengdi Wang, et al. Stochastic Primal-Dual Methods and Sample Complexity of Reinforcement Learning, 2016, ArXiv.
[7] B. Hajek. Hitting-time and occupation-time bounds implied by drift analysis with applications, 1982, Advances in Applied Probability.
[8] Elizabeth L. Wilmer, et al. Markov Chains and Mixing Times, 2008.
[9] R. Srikant, et al. Asymptotically tight steady-state queue length bounds implied by drift conditions, 2011, Queueing Syst. Theory Appl.
[10] András György, et al. Online Learning in Markov Decision Processes with Changing Cost Sequences, 2014, ICML.
[11] Xiaohan Wei, et al. Online Convex Optimization with Stochastic Constraints, 2017, NIPS.
[12] Martin Zinkevich, et al. Online Convex Programming and Generalized Infinitesimal Gradient Ascent, 2003, ICML.
[13] Hao Yu, et al. Online Convex Optimization with Time-Varying Constraints, 2017, arXiv:1702.04783.
[14] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[15] Cédric Archambeau, et al. Adaptive Algorithms for Online Convex Optimization with Long-term Constraints, 2015, ICML.
[16] Constantine Caramanis, et al. Efficient Algorithms for Budget-Constrained Markov Decision Processes, 2014, IEEE Transactions on Automatic Control.
[17] Elad Hazan, et al. Introduction to Online Convex Optimization, 2016, Found. Trends Optim.
[18] Asuman E. Ozdaglar, et al. Approximate Primal Solutions and Rate Analysis for Dual Subgradient Methods, 2008, SIAM J. Optim.
[19] E. Altman. Constrained Markov Decision Processes, 1999.
[20] Dimitri P. Bertsekas, et al. Convex Optimization Theory, 2009.
[21] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[22] Rebecca Willett, et al. Online Markov Decision Processes With Kullback–Leibler Control Cost, 2014, IEEE Transactions on Automatic Control.
[23] Anand Sivasubramaniam, et al. Optimal power cost management using stored energy in data centers, 2011, SIGMETRICS.
[24] Alan Scheller-Wolf, et al. Exact analysis of the M/M/k/setup class of Markov chains via recursive renewal reward, 2013, SIGMETRICS '13.
[25] Xiaohan Wei, et al. Data Center Server Provision: Distributed Asynchronous Control for Coupled Renewal Systems, 2015, IEEE/ACM Transactions on Networking.
[26] Anshul Gandhi, et al. Dynamic Server Provisioning for Data Center Power Management, 2013.
[27] Xiaohan Wei, et al. Online Learning in Weakly Coupled Markov Decision Processes, 2017, PERV.
[28] Qing Ling, et al. An Online Convex Optimization Approach to Proactive Network Resource Allocation, 2017, IEEE Transactions on Signal Processing.
[29] Michael J. Neely. Online fractional programming for Markov decision systems, 2011, 2011 49th Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[30] Rong Jin, et al. Trading regret for efficiency: online convex optimization with long term constraints, 2011, J. Mach. Learn. Res.
[31] Xiaohan Wei, et al. Opportunistic Scheduling over Renewal Systems: An Empirical Method, 2016.
[32] Elad Hazan, et al. Logarithmic regret algorithms for online convex optimization, 2006, Machine Learning.
[33] Csaba Szepesvari, et al. Markov Decision Processes under Bandit Feedback, 2015.
[34] Shie Mannor, et al. Markov Decision Processes with Arbitrary Reward Processes, 2008, Math. Oper. Res.
[35] Tor Lattimore, et al. The Sample-Complexity of General Reinforcement Learning, 2013, ICML.
[36] Hao Yu, et al. A Low Complexity Algorithm with $O(\sqrt{T})$ Regret and Finite Constraint Violations for Online Convex Optimization with Long Term Constraints, 2016, ArXiv.
[37] Xiaohan Wei, et al. On the Theory and Application of Distributed Asynchronous Optimization over Weakly Coupled Renewal Systems, 2016.