John C. S. Lui | Hong Xie | Yishi Lin | Li Ye
[1] John Langford, et al. Doubly Robust Policy Evaluation and Learning, 2011, ICML.
[2] Thorsten Joachims, et al. Counterfactual Risk Minimization: Learning from Logged Bandit Feedback, 2015, ICML.
[3] Zoran Popovic, et al. The Queue Method: Handling Delay, Heuristics, Prior Data, and Evaluation in Bandits, 2015, AAAI.
[4] Vianney Perchet, et al. Bounded Regret in Stochastic Multi-armed Bandits, 2013, COLT.
[5] Elias Bareinboim, et al. Bandits with Unobserved Confounders: A Causal Approach, 2015, NIPS.
[6] Csaba Szepesvári, et al. Improved Algorithms for Linear Stochastic Bandits, 2011, NIPS.
[7] Peter Auer, et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs, 2003, J. Mach. Learn. Res.
[8] L. Hansen. Large Sample Properties of Generalized Method of Moments Estimators, 1982.
[9] David M. Blei, et al. The Deconfounded Recommender: A Causal Inference Approach to Recommendation, 2018, arXiv.
[10] Heinrich Jiang, et al. Uniform Convergence Rates for Kernel Density Estimation, 2017, ICML.
[11] Tianqi Chen, et al. XGBoost: A Scalable Tree Boosting System, 2016, KDD.
[12] Lihong Li. Offline Evaluation and Optimization for Interactive Systems, 2015, WSDM.
[13] Tor Lattimore, et al. Causal Bandits: Learning Good Interventions via Causal Inference, 2016, NIPS.
[14] Melody Y. Guan, et al. Nonparametric Stochastic Contextual Bandits, 2018, AAAI.
[15] Wei Chu, et al. An Unbiased Offline Evaluation of Contextual Bandit Algorithms with Generalized Linear Models, 2011.
[16] Wei Chu, et al. Contextual Bandits with Linear Payoff Functions, 2011, AISTATS.
[17] Jack Bowden, et al. Multi-armed Bandit Models for the Optimal Design of Clinical Trials: Benefits and Challenges, 2015, Statistical Science.
[18] Susan Athey, et al. Estimation Considerations in Contextual Bandits, 2017, arXiv.
[19] Elias Bareinboim, et al. Counterfactual Data-Fusion for Online Reinforcement Learners, 2017, ICML.
[20] Thomas P. Hayes, et al. Stochastic Linear Optimization under Bandit Feedback, 2008, COLT.
[21] P. Austin. An Introduction to Propensity Score Methods for Reducing the Effects of Confounding in Observational Studies, 2011, Multivariate Behavioral Research.
[22] Anmol Bhasin, et al. From Infrastructure to Culture: A/B Testing Challenges in Large Scale Social Networks, 2015, KDD.
[23] E. Duflo, et al. How Much Should We Trust Differences-in-Differences Estimates?, 2001.
[24] Wei Chu, et al. A Contextual-Bandit Approach to Personalized News Article Recommendation, 2010, WWW '10.
[25] D. McCaffrey, et al. Propensity Score Estimation with Boosted Regression for Evaluating Causal Effects in Observational Studies, 2004, Psychological Methods.
[26] J. Borwein, et al. Uniform Bounds for the Complementary Incomplete Gamma Function, 2009.
[27] S. Zahl. Bounds for the Central Limit Theorem Error, 1966.
[28] Shipra Agrawal, et al. Analysis of Thompson Sampling for the Multi-armed Bandit Problem, 2011, COLT.
[29] D. Rubin. Causal Inference Using Potential Outcomes, 2005.
[30] Elizabeth A. Stuart, et al. An Introduction to Sensitivity Analysis for Unobserved Confounding in Nonexperimental Prevention Research, 2013, Prevention Science.
[31] Elizabeth A. Stuart, et al. Matching Methods for Causal Inference: A Review and a Look Forward, 2010, Statistical Science.
[32] P. Billingsley, et al. Probability and Measure, 1980.
[33] Nathan Kallus, et al. Balanced Policy Evaluation and Learning, 2017, NeurIPS.
[34] Shi Dong, et al. An Information-Theoretic Analysis for Thompson Sampling with Many Actions, 2018, NeurIPS.
[35] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[36] Joaquin Quiñonero Candela, et al. Counterfactual Reasoning and Learning Systems: The Example of Computational Advertising, 2013, J. Mach. Learn. Res.
[37] J. Pearl. Causality: Models, Reasoning and Inference, 2000.
[38] S. Athey, et al. Generalized Random Forests, 2016, The Annals of Statistics.
[39] Huazheng Wang, et al. Learning Hidden Features for Contextual Bandits, 2016, CIKM.
[40] Doina Precup, et al. Algorithms for Multi-armed Bandit Problems, 2014, arXiv.
[41] D. Rubin, et al. The Central Role of the Propensity Score in Observational Studies for Causal Effects, 1983.
[42] D. Rubin, et al. Causal Inference for Statistics, Social, and Biomedical Sciences: Sensitivity Analysis and Bounds, 2015.
[43] W. Hoeffding. Probability Inequalities for Sums of Bounded Random Variables, 1963.
[44] Stefan Wager, et al. Estimation and Inference of Heterogeneous Treatment Effects Using Random Forests, 2015, Journal of the American Statistical Association.
[45] Raphaël Féraud, et al. Random Forest for the Contextual Bandit Problem, 2015, AISTATS.
[46] Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem, 2002, SIAM J. Comput.
[47] John Langford, et al. Warm-starting Contextual Bandits: Robustly Combining Supervised and Bandit Feedback, 2019, ICML.
[48] Mélanie Frappier, et al. The Book of Why: The New Science of Cause and Effect, 2018, Science.
[49] Thorsten Joachims, et al. Multi-armed Bandit Problems with History, 2012, AISTATS.