Aurélien F. Bibaut | Antoine Chambaz | Maria Dimakopoulou | Nathan Kallus | Mark van der Laan
[1] Wei Chu, et al. A contextual-bandit approach to personalized news article recommendation, 2010, WWW '10.
[2] Ambuj Tewari, et al. Sequential complexities and uniform martingale laws of large numbers, 2015.
[3] Nathan Kallus, et al. Fast Rates for Contextual Linear Optimization, 2020, Manag. Sci.
[4] Soumendu Sundar Mukherjee, et al. Weak convergence and empirical processes, 2019.
[5] Stefan Wager, et al. Policy Learning With Observational Data, 2017, Econometrica.
[6] M. J. van der Laan, et al. Statistical inference for the mean outcome under a possibly non-unique optimal treatment strategy, 2016, Annals of Statistics.
[7] Mehrdad Farajtabar, et al. More Robust Doubly Robust Off-policy Evaluation, 2018, ICML.
[8] Stefan Wager, et al. Confidence intervals for policy evaluation in adaptive experiments, 2021, Proceedings of the National Academy of Sciences.
[9] Wei Chu, et al. Contextual Bandits with Linear Payoff Functions, 2011, AISTATS.
[10] Yi Su, et al. CAB: Continuous Adaptive Blending for Policy Evaluation and Learning, 2019, ICML.
[11] B. Ozler. A sequential and adaptive experiment to increase the uptake of long-acting reversible contraceptives in Cameroon, 2018, AEA Randomized Controlled Trials.
[12] Toru Kitagawa, et al. Who should be Treated? Empirical Welfare Maximization Methods for Treatment Choice, 2015.
[13] Philippe Rigollet, et al. Nonparametric Bandits with Covariates, 2010, COLT.
[14] Aurélien F. Bibaut, et al. Fast rates for empirical risk minimization over càdlàg functions with bounded sectional variation norm, 2019.
[15] E. Mammen, et al. Smooth Discrimination Analysis, 1999.
[16] Masatoshi Uehara, et al. Intrinsically Efficient, Stable, and Bounded Off-Policy Evaluation for Reinforcement Learning, 2019, NeurIPS.
[17] Mark J. van der Laan, et al. Empirical Efficiency Maximization: Improved Locally Efficient Covariate Adjustment in Randomized Experiments and Survival Analysis, 2008, The International Journal of Biostatistics.
[18] S. R. Jammalamadaka, et al. Empirical Processes in M-Estimation, 2001.
[19] Ambuj Tewari, et al. From Ads to Interventions: Contextual Bandits in Mobile Health, 2017, Mobile Health - Sensors, Analytic Methods, and Applications.
[20] Thorsten Joachims, et al. Batch learning from logged bandit feedback through counterfactual risk minimization, 2015, J. Mach. Learn. Res.
[21] Ramon van Handel. On the minimal penalty for Markov order estimation, 2009, arXiv.
[22] J. Robins, et al. Marginal Structural Models and Causal Inference in Epidemiology, 2000, Epidemiology.
[23] Masatoshi Uehara, et al. Optimal Off-Policy Evaluation from Multiple Logging Policies, 2020, ICML.
[24] Susan Athey, et al. Estimation Considerations in Contextual Bandits, 2017, arXiv.
[25] Akshay Krishnamurthy, et al. Contextual bandits with surrogate losses: Margin bounds and efficient algorithms, 2018, NeurIPS.
[26] Vasilis Syrgkanis, et al. Orthogonal Statistical Learning, 2019, The Annals of Statistics.
[27] Mohsen Bayati, et al. Dynamic Pricing with Demand Covariates, 2016, arXiv:1604.07463.
[28] John Langford, et al. Efficient Optimal Learning for Contextual Bandits, 2011, UAI.
[29] Masatoshi Uehara, et al. Fast Rates for the Regret of Offline Reinforcement Learning, 2021, COLT.
[30] B. Karrer, et al. AE: A domain-agnostic platform for adaptive experimentation, 2018.
[31] Donglin Zeng, et al. Estimating Individualized Treatment Rules Using Outcome Weighted Learning, 2012, Journal of the American Statistical Association.
[32] Nathan Kallus, et al. Smooth Contextual Bandits: Bridging the Parametric and Non-differentiable Regret Regimes, 2019, COLT.
[33] Madeleine Udell, et al. Dynamic Assortment Personalization in High Dimensions, 2016, Oper. Res.
[34] A. Zeevi, et al. A Linear Response Bandit Problem, 2013.
[35] Mohsen Bayati, et al. Online Decision-Making with High-Dimensional Covariates, 2015.
[36] David Simchi-Levi, et al. Bypassing the Monster: A Faster and Simpler Optimal Algorithm for Contextual Bandits under Realizability, 2020, SSRN Electronic Journal.
[37] Maximilian Kasy, et al. Adaptive Treatment Assignment in Experiments for Policy Choice, 2019, Econometrica.
[38] Sébastien Bubeck, et al. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems, 2012, Found. Trends Mach. Learn.
[39] Michael I. Jordan, et al. Convexity, Classification, and Risk Bounds, 2006.
[40] A. Tsybakov, et al. Optimal aggregation of classifiers in statistical learning, 2003.
[41] M. Ossiander. A Central Limit Theorem Under Metric Entropy with $L_2$ Bracketing, 1987.
[42] Csaba Szepesvari. Bandit Algorithms, 2020.
[43] V. Koltchinskii. Local Rademacher complexities and oracle inequalities in risk minimization, 2006, arXiv:0708.0083.
[44] Edward H. Kennedy. Optimal doubly robust estimation of heterogeneous causal effects, 2020, arXiv:2004.14497.
[45] Antoine Chambaz, et al. Generalized Policy Elimination: an efficient algorithm for Nonparametric Contextual Bandits, 2020, UAI.
[46] M. Davidian, et al. Improving efficiency and robustness of the doubly robust estimator for a population mean with incomplete data, 2009, Biometrika.
[47] P. Massart, et al. Concentration inequalities and model selection, 2007.
[48] D. Freedman, et al. Weighting Regressions by Propensity Scores, 2008, Evaluation Review.
[49] John Langford, et al. Doubly Robust Policy Evaluation and Optimization, 2014, arXiv.
[50] Vianney Perchet, et al. The multi-armed bandit problem with covariates, 2011, arXiv.
[51] Zhengyuan Zhou, et al. Policy Learning with Adaptively Collected Data, 2021, arXiv.
[52] Alexander Rakhlin, et al. Beyond UCB: Optimal and Efficient Contextual Bandits with Regression Oracles, 2020, ICML.
[53] John Langford, et al. Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits, 2014, ICML.
[54] P. Bartlett, et al. Local Rademacher complexities, 2005, arXiv:math/0508275.