Mark D. Reid | Peter Grünwald | Robert C. Williamson | Tim van Erven | Nishant A. Mehta
[1] Yu. V. Prokhorov. Convergence of Random Processes and Limit Theorems in Probability Theory, 1956.
[2] H. Richter. Parameterfreie Abschätzung und Realisierung von Erwartungswerten, 1957.
[3] W. J. Studden et al. Tchebycheff Systems: With Applications in Analysis and Statistics, 1967.
[4] Gerald S. Rogers et al. Mathematical Statistics: A Decision Theoretic Approach, 1967.
[5] J. Kemperman. The General Moment Problem, A Geometric Approach, 1968.
[6] V. N. Vapnik and A. Ya. Chervonenkis. On the uniform convergence of relative frequencies of events to their probabilities, 1971.
[7] D. Freedman. On Tail Probabilities for Martingales, 1975.
[8] V. Vapnik et al. Necessary and Sufficient Conditions for the Uniform Convergence of Means to their Expectations, 1982.
[9] A. Barron. Are Bayes Rules Consistent in Information?, 1987.
[10] Thomas M. Cover et al. Open Problems in Communication and Computation, 2011, Springer New York.
[11] Vladimir Vovk et al. Aggregating strategies, 1990, COLT '90.
[12] Andrew R. Barron et al. Minimum complexity density estimation, 1991, IEEE Trans. Inf. Theory.
[13] Vladimir Vovk et al. A game of prediction with expert advice, 1995, COLT '95.
[14] Peter L. Bartlett et al. The importance of convexity in learning with squared loss, 1996, COLT '96.
[15] Peter L. Bartlett et al. Efficient agnostic learning of neural networks with bounded fan-in, 1996, IEEE Trans. Inf. Theory.
[16] Jon A. Wellner et al. Weak Convergence and Empirical Processes: With Applications to Statistics, 1996.
[17] Mathukumalli Vidyasagar et al. A Theory of Learning and Generalization, 1997.
[18] Vladimir Vapnik et al. Statistical learning theory, 1998.
[19] Peter L. Bartlett et al. The Importance of Convexity in Learning with Squared Loss, 1998, IEEE Trans. Inf. Theory.
[20] Yuhong Yang et al. Information-theoretic determination of minimax rates of convergence, 1999.
[21] A. Barron et al. Estimation of mixture models, 1999.
[22] Manfred K. Warmuth et al. Averaging Expert Predictions, 1999, EuroCOLT.
[23] Peter Grünwald. Viewing all models as “probabilistic”, 1999, COLT '99.
[24] E. Mammen et al. Smooth Discrimination Analysis, 1999.
[25] A. van der Vaart et al. Convergence rates of posterior distributions, 2000.
[26] V. Vovk. Competitive On-line Statistics, 2001.
[27] Shahar Mendelson et al. Agnostic Learning Nonconvex Function Classes, 2002, COLT.
[28] Mathukumalli Vidyasagar et al. Learning and Generalization: With Applications to Neural Networks, 2002.
[29] A. Tsybakov et al. Optimal aggregation of classifiers in statistical learning, 2003.
[30] Mark Braverman et al. Learnability and automatizability, 2004, 45th Annual IEEE Symposium on Foundations of Computer Science.
[31] Claudio Gentile et al. On the generalization ability of on-line learning algorithms, 2001, IEEE Trans. Inf. Theory.
[32] A. Dawid et al. Game theory, maximum entropy, minimum discrepancy and robust Bayesian decision theory, 2004, math/0410076.
[33] Tong Zhang. From ε-entropy to KL-entropy: Analysis of minimum information complexity density estimation, 2006, math/0702653.
[34] P. Bartlett et al. Empirical minimization, 2006.
[35] Michael I. Jordan et al. Convexity, Classification, and Risk Bounds, 2006.
[36] Tong Zhang et al. Information-theoretic upper and lower bounds for statistical estimation, 2006, IEEE Trans. Inf. Theory.
[37] V. Koltchinskii. Local Rademacher complexities and oracle inequalities in risk minimization, 2006, 0708.0083.
[38] Gábor Lugosi et al. Prediction, learning, and games, 2006.
[39] A. van der Vaart et al. Misspecification in infinite-dimensional Bayesian statistics, 2006, math/0607023.
[40] O. Catoni. PAC-Bayesian Supervised Classification: The Thermodynamics of Statistical Learning, 2007, 0712.0248.
[41] Jean-Yves Audibert. Progressive mixture rules are deviation suboptimal, 2007, NIPS.
[42] Y. Singer et al. Logarithmic Regret Algorithms for Strongly Convex Repeated Games, 2007.
[43] P. Massart et al. Risk bounds for statistical learning, 2007, math/0702683.
[44] Elad Hazan et al. Logarithmic regret algorithms for online convex optimization, 2006, Machine Learning.
[45] Peter L. Bartlett et al. Adaptive Online Gradient Descent, 2007, NIPS.
[46] P. Grünwald. The Minimum Description Length Principle (Adaptive Computation and Machine Learning), 2007.
[47] A. Juditsky et al. Learning by mirror averaging, 2005, math/0511468.
[48] Shahar Mendelson et al. Lower Bounds for the Empirical Minimization Algorithm, 2008, IEEE Trans. Inf. Theory.
[49] J. Rissanen et al. That Simple Device Already Used by Gauss, 2008.
[50] Vladimir Vovk et al. Prediction with expert advice for the Brier game, 2007, ICML '08.
[51] Shahar Mendelson et al. Obtaining fast error rates in nonconvex situations, 2008, J. Complex.
[52] Jean-Yves Audibert. Fast learning rates in statistical inference through aggregation, 2007, math/0703854.
[53] Vladimir Vovk et al. Supermartingales in prediction with expert advice, 2008, Theor. Comput. Sci.
[54] Jorma Rissanen et al. Minimum Description Length Principle, 2010, Encyclopedia of Machine Learning.
[55] Peter Grünwald et al. Safe Learning: bridging the gap between Bayes, MDL and statistical learning theory via empirical convexity, 2011, COLT.
[56] P. Bartlett et al. Margin-adaptive model selection in statistical learning, 2008, 0804.2937.
[57] Mark D. Reid et al. Mixability is Bayes Risk Curvature Relative to Log Loss, 2011, COLT.
[58] R. Bass. Convergence of probability measures, 2011.
[59] Shai Ben-David et al. Multiclass Learnability and the ERM principle, 2011, COLT.
[60] Guillaume Lecué. Interplay between concentration, complexity and geometry in learning theory with applications to high dimensional data analysis, 2011.
[61] Mark D. Reid et al. Mixability in Statistical Learning, 2012, NIPS.
[62] Peter Grünwald et al. The Safe Bayesian - Learning the Learning Rate via the Mixability Gap, 2012, ALT.
[63] Arnak S. Dalalyan et al. Mirror averaging with sparsity priors, 2010, 1003.1189.
[64] S. Walker et al. Bayesian asymptotics with misspecified models, 2013.
[65] R. Ramamoorthi et al. On Posterior Concentration in Misspecified Models, 2013, 1312.4620.
[66] Tim van Erven et al. From Exp-concavity to Mixability, 2013.
[67] Gábor Lugosi et al. Concentration Inequalities: A Nonasymptotic Theory of Independence, 2013.
[68] Robert C. Williamson et al. From Stochastic Mixability to Fast Rates, 2014, NIPS.
[69] Shai Ben-David et al. The sample complexity of agnostic learning under deterministic labels, 2014, COLT.
[70] Wouter M. Koolen et al. Follow the leader if you can, hedge if you must, 2013, J. Mach. Learn. Res.
[71] Shahar Mendelson. Learning without Concentration, 2014, COLT.
[72] Wouter M. Koolen et al. Learning the Learning Rate for Prediction with Expert Advice, 2014, NIPS.
[73] Thijs van Ommen et al. Inconsistency of Bayesian Inference for Misspecified Linear Models, and a Proposal for Repairing It, 2014, 1412.3730.
[74] Xinhua Zhang et al. Exp-Concavity of Proper Composite Losses, 2015, COLT.
[75] Mark D. Reid et al. Composite Multiclass Losses, 2011, J. Mach. Learn. Res.