The Safe Bayesian - Learning the Learning Rate via the Mixability Gap
[1] A. Tsybakov,et al. Optimal aggregation of classifiers in statistical learning , 2003 .
[2] A. V. D. Vaart. Asymptotic Statistics: Delta Method , 1998 .
[3] R. A. Leibler,et al. On Information and Sufficiency , 1951 .
[4] John Langford,et al. Suboptimal Behavior of Bayes and MDL in Classification Under Misspecification , 2004, COLT.
[5] E. Mammen,et al. Smooth Discrimination Analysis , 1999 .
[6] A. V. D. Vaart,et al. Asymptotic Statistics: Frontmatter , 1998 .
[7] V. Vovk. Competitive On‐line Statistics , 2001 .
[8] Ronen I. Brafman,et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning , 2001, J. Mach. Learn. Res..
[9] A. V. D. Vaart,et al. Asymptotic Statistics: U-Statistics , 1998 .
[10] A. V. D. Vaart,et al. Misspecification in infinite-dimensional Bayesian statistics , 2006, math/0607023.
[11] Yoav Freund,et al. A Parameter-free Hedging Algorithm , 2009, NIPS.
[12] Andrew R. Barron,et al. Minimum complexity density estimation , 1991, IEEE Trans. Inf. Theory.
[13] Michael I. Jordan,et al. Convexity, Classification, and Risk Bounds , 2006 .
[14] A. Barron,et al. Robustly Minimax Codes for Universal Data Compression , 1998 .
[15] Yoav Freund,et al. A decision-theoretic generalization of on-line learning and an application to boosting , 1995, EuroCOLT.
[16] Ziheng Yang,et al. Fair-balance paradox, star-tree paradox, and Bayesian phylogenetics , 2007, Molecular biology and evolution.
[17] Wouter M. Koolen,et al. Adaptive Hedge , 2011, NIPS.
[18] David A. McAllester. PAC-Bayesian Stochastic Model Selection , 2003, Machine Learning.
[19] Tong Zhang,et al. Information-theoretic upper and lower bounds for statistical estimation , 2006, IEEE Transactions on Information Theory.
[20] A. Barron,et al. Estimation of mixture models , 1999 .
[21] Vladimir Vovk,et al. Aggregating strategies , 1990, COLT '90.
[22] Tong Zhang. From ɛ-entropy to KL-entropy: Analysis of minimum information complexity density estimation , 2006, math/0702653.
[23] C. Shalizi. Dynamics of Bayesian Updating with Dependent Data and Misspecified Models , 2009, 0901.1342.
[24] O. Catoni. PAC-BAYESIAN SUPERVISED CLASSIFICATION: The Thermodynamics of Statistical Learning , 2007, 0712.0248.
[25] Peter Grünwald,et al. Safe Learning: bridging the gap between Bayes, MDL and statistical learning theory via empirical convexity , 2011, COLT.
[26] Matthias Seeger,et al. PAC-Bayesian Generalization Error Bounds for Gaussian Process Classification , 2002 .
[27] Yoav Freund,et al. A decision-theoretic generalization of on-line learning and an application to boosting , 1997, EuroCOLT.
[28] P. Grünwald. The Minimum Description Length Principle (Adaptive Computation and Machine Learning) , 2007 .
[29] A. P. Dawid,et al. Present position and potential developments: some personal views , 1984 .