Information Theoretic Guarantees for Empirical Risk Minimization with Applications to Model Selection and Large-Scale Optimization
[1] Aaron Roth, et al. Adaptive Learning with Robust Generalization Guarantees, 2016, COLT.
[2] A. Wald, et al. On Stochastic Limit and Order Relationships, 1943.
[3] Nathan Srebro, et al. Fast Rates for Regularized Objectives, 2008, NIPS.
[4] Toniann Pitassi, et al. Preserving Statistical Validity in Adaptive Data Analysis, 2014, STOC.
[5] Vladimir Vapnik, et al. An overview of statistical learning theory, 1999, IEEE Trans. Neural Networks.
[6] Imre Csiszár, et al. Axiomatic Characterizations of Information Measures, 2008, Entropy.
[7] R. A. Silverman, et al. Introductory Real Analysis, 1972.
[8] Ibrahim M. Alabdulmohsin. An Information-Theoretic Route from Generalization in Expectation to Generalization in Probability, 2017, AISTATS.
[9] H. Akaike, et al. Information Theory and an Extension of the Maximum Likelihood Principle, 1973.
[10] G. Schwarz. Estimating the Dimension of a Model, 1978.
[11] I. Csiszár. A class of measures of informativity of observation channels, 1972.
[12] Prasad Raghavendra, et al. Agnostic Learning of Monomials by Halfspaces Is Hard, 2009, 50th Annual IEEE Symposium on Foundations of Computer Science.
[13] Maxim Raginsky, et al. Information-theoretic analysis of stability and bias of learning algorithms, 2016, IEEE Information Theory Workshop (ITW).
[14] Svante Janson. Probability Asymptotics: Notes on Notation, 2009.
[15] Charles Elkan, et al. The Foundations of Cost-Sensitive Learning, 2001, IJCAI.
[16] Ibrahim M. Alabdulmohsin. Algorithmic Stability and Uniform Generalization, 2015, NIPS.
[17] M. Kull, P. A. Flach. Novel Decompositions of Proper Scoring Rules for Classification: Score Adjustment as Precursor to Calibration, 2015.
[18] Joel A. Tropp, et al. User-Friendly Tail Bounds for Sums of Random Matrices, 2010, Found. Comput. Math.
[19] Ohad Shamir, et al. Using More Data to Speed-up Training Time, 2011, AISTATS.
[20] William W. Hager, et al. Updating the Inverse of a Matrix, 1989, SIAM Rev.
[21] Michael I. Jordan, et al. Convexity, Classification, and Risk Bounds, 2006.
[22] Léon Bottou, et al. The Tradeoffs of Large Scale Learning, 2007, NIPS.
[23] Thomas M. Cover, et al. Elements of Information Theory, 2005.
[24] Vitaly Feldman, et al. Generalization of ERM in Stochastic Convex Optimization: The Dimension Strikes Back, 2016, NIPS.
[25] Kfir Y. Levy, et al. Fast Rates for Exp-concave Empirical Risk Minimization, 2015, NIPS.
[26] Boris Polyak, et al. Acceleration of stochastic approximation by averaging, 1992.
[27] T. Tao. Topics in Random Matrix Theory, 2012.
[28] Catherine Blake, et al. UCI Repository of machine learning databases, 1998.
[29] David M. Blei, et al. A Variational Analysis of Stochastic Gradient Algorithms, 2016, ICML.
[30] Ohad Shamir, et al. Stochastic Convex Optimization, 2009, COLT.
[31] Shai Ben-David, et al. Understanding Machine Learning: From Theory to Algorithms, 2014.