Bayesian Counterfactual Risk Minimization
[1] Lihong Li,et al. Learning from Logged Implicit Exploration Data , 2010, NIPS.
[2] Yu-Xiang Wang,et al. Imitation-Regularized Offline Learning , 2019, AISTATS.
[3] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[4] Yoram Singer,et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization , 2011, J. Mach. Learn. Res.
[5] David A. McAllester. Simplified PAC-Bayesian Margin Bounds , 2003, COLT.
[6] D. Horvitz,et al. A Generalization of Sampling Without Replacement from a Finite Universe , 1952.
[7] John Shawe-Taylor,et al. PAC-Bayesian Inequalities for Martingales , 2011, IEEE Transactions on Information Theory.
[8] Lihong Li,et al. An Empirical Evaluation of Thompson Sampling , 2011, NIPS.
[9] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[10] François Laviolette,et al. PAC-Bayesian learning of linear classifiers , 2009, ICML '09.
[11] O. Catoni. PAC-Bayesian Supervised Classification: The Thermodynamics of Statistical Learning , 2007, arXiv:0712.0248.
[12] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[13] J. Atchison,et al. Logistic-normal distributions: Some properties and uses , 1980.
[14] Shiliang Sun,et al. PAC-Bayes bounds for stable algorithms with instance-dependent priors , 2018, NeurIPS.
[15] M. de Rijke,et al. Deep Learning with Logged Bandit Feedback , 2018, ICLR.
[16] John Shawe-Taylor,et al. Distribution-Dependent PAC-Bayes Priors , 2010, ALT.
[17] Dacheng Tao,et al. Algorithmic Stability and Hypothesis Complexity , 2017, ICML.
[18] John Shawe-Taylor,et al. PAC-Bayesian Analysis of Contextual Bandits , 2011, NIPS.
[19] John Langford,et al. Doubly Robust Policy Evaluation and Learning , 2011, ICML.
[20] John Shawe-Taylor,et al. PAC Bayes and Margins , 2003.
[21] Thorsten Joachims,et al. The Self-Normalized Estimator for Counterfactual Learning , 2015, NIPS.
[22] Shiliang Sun,et al. PAC-Bayes bounds with data dependent priors , 2012, J. Mach. Learn. Res.
[23] Joaquin Quiñonero Candela,et al. Counterfactual reasoning and learning systems: the example of computational advertising , 2013, J. Mach. Learn. Res.
[24] Thorsten Joachims,et al. Batch learning from logged bandit feedback through counterfactual risk minimization , 2015, J. Mach. Learn. Res.
[25] David A. McAllester. PAC-Bayesian model averaging , 1999, COLT '99.
[26] Gintare Karolina Dziugaite,et al. Entropy-SGD optimizes the prior of a PAC-Bayes bound: Data-dependent PAC-Bayes priors via differential privacy , 2017, NeurIPS.
[27] E. Ionides. Truncated Importance Sampling , 2008.
[28] Roland Vollgraf,et al. Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms , 2017, ArXiv.
[29] Alex Krizhevsky,et al. Learning Multiple Layers of Features from Tiny Images , 2009.
[30] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.
[31] D. Rubin,et al. The central role of the propensity score in observational studies for causal effects , 1983.
[32] Matthias W. Seeger,et al. PAC-Bayesian Generalisation Error Bounds for Gaussian Process Classification , 2003, J. Mach. Learn. Res.
[33] Yevgeny Seldin,et al. PAC-Bayes-Empirical-Bernstein Inequality , 2013, NIPS.
[34] John Langford,et al. The offset tree for learning with partial labels , 2008, KDD.