[1] P. Bickel. Efficient and Adaptive Estimation for Semiparametric Models, 1993.
[2] Benjamin Van Roy, et al. Learning to Optimize via Posterior Sampling, 2013, Math. Oper. Res.
[3] Karthik Sridharan, et al. BISTRO: An Efficient Relaxation-Based Method for Contextual Bandits, 2016, ICML.
[4] Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem, 2002, SIAM J. Comput.
[5] John Langford, et al. Contextual Bandit Algorithms with Supervised Learning Guarantees, 2010, AISTATS.
[6] Aurélien Garivier, et al. Parametric Bandits: The Generalized Linear Case, 2010, NIPS.
[7] J. Robins, et al. Double/Debiased Machine Learning for Treatment and Causal Parameters, 2016, arXiv:1608.00060.
[8] Lihong Li, et al. Provable Optimal Algorithms for Generalized Linear Contextual Bandits, 2017, arXiv.
[9] M. Sion. On General Minimax Theorems, 1958.
[10] P. Robinson. Root-N-Consistent Semiparametric Regression, 1988.
[11] T. Lai, et al. Theory and Applications of Multivariate Self-Normalized Processes, 2009.
[12] John Langford, et al. Off-Policy Evaluation for Slate Recommendation, 2016, NIPS.
[13] Wei Chu, et al. A Contextual-Bandit Approach to Personalized News Article Recommendation, 2010, WWW '10.
[14] J. Robins, et al. Recovery of Information and Adjustment for Dependent Censoring Using Surrogate Markers, 1992.
[15] T. Lai, et al. Self-Normalized Processes: Limit Theory and Statistical Applications, 2001.
[16] Nicolò Cesa-Bianchi, et al. Combinatorial Bandits, 2012, COLT.
[17] John Langford, et al. The Epoch-Greedy Algorithm for Multi-armed Bandits with Side Information, 2007, NIPS.
[18] D. Freedman. On Tail Probabilities for Martingales, 1975.
[19] A. Tsiatis. Semiparametric Theory and Missing Data, 2006.
[20] John N. Tsitsiklis, et al. Linearly Parameterized Bandits, 2008, Math. Oper. Res.
[21] Wei Chu, et al. Contextual Bandits with Linear Payoff Functions, 2011, AISTATS.
[22] Akshay Krishnamurthy, et al. Efficient Algorithms for Adversarial Contextual Learning, 2016, ICML.
[23] John Langford, et al. Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits, 2014, ICML.
[24] W. Newey, et al. Double Machine Learning for Treatment and Causal Parameters, 2016.
[25] Shipra Agrawal, et al. Thompson Sampling for Contextual Bandits with Linear Payoffs, 2012, ICML.
[26] Ambuj Tewari, et al. From Ads to Interventions: Contextual Bandits in Mobile Health, 2017, Mobile Health - Sensors, Analytic Methods, and Applications.
[27] Shie Mannor, et al. Action Elimination and Stopping Conditions for the Multi-Armed Bandit and Reinforcement Learning Problems, 2006, J. Mach. Learn. Res.
[28] Akshay Krishnamurthy, et al. Contextual Semibandits via Supervised Learning Oracles, 2015, NIPS.
[29] Kristjan H. Greenewald, et al. Action Centered Contextual Bandits, 2017, NIPS.
[30] Csaba Szepesvári, et al. Improved Algorithms for Linear Stochastic Bandits, 2011, NIPS.
[31] Thomas P. Hayes, et al. Stochastic Linear Optimization under Bandit Feedback, 2008, COLT.
[32] Sham M. Kakade, et al. Towards Minimax Policies for Online Linear Optimization with Bandit Feedback, 2012, COLT.
[33] Aad van der Vaart, et al. Higher Order Influence Functions and Minimax Estimation of Nonlinear Functionals, 2008, arXiv:0805.3040.
[34] Thomas M. Cover, et al. Behavior of Sequential Predictors of Binary Sequences, 1965.
[35] Elad Hazan, et al. Competing in the Dark: An Efficient Algorithm for Bandit Linear Optimization, 2008, COLT.