Chasing Ghosts: Competing with Stateful Policies