Tracking Adversarial Targets
Varun Kanade | Peter L. Bartlett | Yasin Abbasi-Yadkori
[1] Ronald A. Howard,et al. Dynamic Programming and Markov Processes , 1960 .
[2] T. Lai,et al. Least Squares Estimates in Stochastic Regression Models with Applications to Identification and Control of Dynamic Systems , 1982 .
[3] Han-Fu Chen,et al. Optimal adaptive control and consistent parameter estimates for ARMAX model with quadratic cost , 1986, 1986 25th IEEE Conference on Decision and Control.
[4] T. Lai,et al. Asymptotically efficient self-tuning regulators , 1987 .
[5] Han-Fu Chen,et al. Optimal adaptive control and consistent parameter estimates for ARMAX model with quadratic cost , 1987 .
[6] Han-Fu Chen,et al. Identification and adaptive control for systems with unknown orders, delay, and coefficients , 1990 .
[7] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[8] Claude-Nicolas Fiechter,et al. PAC adaptive control of linear systems , 1997, COLT '97.
[9] P. Kumar,et al. Adaptive Linear Quadratic Gaussian Control: The Cost-Biased Approach Revisited , 1998 .
[10] Zhiliang Ying,et al. Efficient Recursive Estimation and Adaptive Control in Stochastic Regression and , 2006 .
[11] S. Bittanti,et al. Adaptive Control of Linear Time Invariant Systems: The "Bet on the Best" Principle , 2006 .
[12] Gábor Lugosi,et al. Prediction, learning, and games , 2006 .
[13] Yishay Mansour,et al. Online Markov Decision Processes , 2009, Math. Oper. Res..
[14] Csaba Szepesvári,et al. The Online Loop-free Stochastic Shortest-Path Problem , 2010, Annual Conference Computational Learning Theory.
[15] Csaba Szepesvári,et al. Regret Bounds for the Adaptive Control of Linear Quadratic Systems , 2011, COLT.
[16] András György,et al. The adversarial stochastic shortest path problem with unknown transition probabilities , 2012, AISTATS.
[17] Peter L. Bartlett,et al. Online Learning in Markov Decision Processes with Adversarially Chosen Transition Probability Distributions , 2013, NIPS.
[18] Csaba Szepesvári,et al. Markov Decision Processes under Bandit Feedback , 2015 .