Distributionally Robust Policy Evaluation and Learning in Offline Contextual Bandits
Zhengyuan Zhou | Fan Zhang | Jose Blanchet | Nian Si