High-Probability Risk Bounds via Sequential Predictors
[1] Tong Zhang. Mathematical Analysis of Machine Learning Algorithms, 2023.
[2] Yeshwanth Cherapanamjeri, et al. Optimal PAC Bounds Without Uniform Convergence, 2023, arXiv.
[3] Nikita Zhivotovskiy, et al. Exploring Local Norms in Exp-concave Statistical Learning, 2023, COLT.
[4] A. Suresh, et al. Concentration Bounds for Discrete Distribution Estimation in KL Divergence, 2023, IEEE International Symposium on Information Theory (ISIT).
[5] G. Blanchard, et al. Constant regret for sequence prediction with limited advice, 2022, ALT.
[6] Dirk van der Hoeven, et al. A Regret-Variance Trade-Off in Online Learning, 2022, NeurIPS.
[7] R. Agrawal. Finite-sample concentration of the empirical relative entropy around its mean, 2022, arXiv.
[8] Varun Kanade, et al. Exponential Tail Local Rademacher Complexity Risk Bounds Without the Bernstein Condition, 2022, arXiv.
[9] Alessandro Rudi, et al. Mixability made efficient: Fast online multiclass logistic regression, 2021, NeurIPS.
[10] Julian Zimmert, et al. Efficient Methods for Online Multiclass Logistic Regression, 2021, ALT.
[11] Daniel M. Roy, et al. Minimax rates for conditional density estimation via empirical entropy, 2021, The Annals of Statistics.
[12] Suhas Vijaykumar, et al. Localization, Convexity, and Star Aggregation, 2021, NeurIPS.
[13] Nikita Zhivotovskiy, et al. Distribution-Free Robust Linear Regression, 2021, Mathematical Statistics and Learning.
[14] Wouter M. Koolen, et al. MetaGrad: Adaptation using Multiple Learning Rates in Online Learning, 2021, J. Mach. Learn. Res.
[15] Nikita Zhivotovskiy, et al. Exponential Savings in Agnostic Active Learning Through Abstention, 2021, IEEE Transactions on Information Theory.
[16] N. V. Vinodchandran, et al. Near-optimal learning of tree-structured distributions by Chow-Liu, 2020, STOC.
[17] Nikita Zhivotovskiy, et al. Suboptimality of Constrained Least Squares and Improvements via Non-Linear Predictors, 2020, Bernoulli.
[18] R. Khardon, et al. Pseudo-Bayesian Learning via Direct Loss Minimization with Applications to Sparse Gaussian Process Models, 2020, AABI.
[19] P. Gaillard, et al. Efficient improper learning for online logistic regression, 2020, COLT.
[20] Jaouad Mourtada, et al. An improper estimator with optimal excess risk in misspecified density estimation and logistic regression, 2019, J. Mach. Learn. Res.
[21] Shahar Mendelson, et al. An Unrestricted Learning Procedure, 2019, J. ACM.
[22] O. Bousquet, et al. Fast classification rates without standard margin assumptions, 2019, Information and Inference: A Journal of the IMA.
[23] Nicholas J. A. Harvey, et al. Tight Analyses for Non-Smooth Stochastic Gradient Descent, 2018, COLT.
[24] Olivier Wintenberger, et al. Efficient online algorithms for fast-rate regret bounds under sparsity, 2018, NeurIPS.
[25] Haipeng Luo, et al. Logistic Regression: The Importance of Being Improper, 2018, COLT.
[26] Wojciech Kotlowski, et al. The Many Faces of Exponential Weights in Online Learning, 2018, COLT.
[27] Nishant Mehta, et al. Fast rates with high probability in exp-concave statistical learning, 2016, AISTATS.
[28] Alon Orlitsky, et al. On Learning Distributions from their Samples, 2015, COLT.
[29] S. Mendelson. On aggregation for heavy-tailed classes, 2015, Probability Theory and Related Fields.
[30] Karthik Sridharan, et al. Learning with Square Loss: Localization through Offset Rademacher Complexity, 2015, COLT.
[31] Ohad Shamir, et al. The sample complexity of learning linear predictors with the squared loss, 2014, J. Mach. Learn. Res.
[32] Elad Hazan, et al. Logistic Regression: Tight Bounds for Stochastic and Online Optimization, 2014, COLT.
[33] Olivier Wintenberger, et al. Optimal learning with Bernstein online aggregation, 2014, Machine Learning.
[34] Koby Crammer, et al. A generalized online mirror descent with applications to classification and regression, 2013, Machine Learning.
[35] P. Rigollet, et al. Optimal learning with Q-aggregation, 2013, arXiv:1301.6080.
[36] Shai Shalev-Shwartz, et al. Online Learning and Online Convex Optimization, 2012, Found. Trends Mach. Learn.
[37] Wojciech Kotlowski, et al. Bounds on Individual Risk for Log-loss Predictors, 2011, COLT.
[38] Thomas M. Cover, et al. Open Problems in Communication and Computation, 2011, Springer New York.
[39] Ohad Shamir, et al. Making Gradient Descent Optimal for Strongly Convex Stochastic Optimization, 2011, ICML.
[40] John Langford, et al. Contextual Bandit Algorithms with Supervised Learning Guarantees, 2010, AISTATS.
[41] S. Mendelson, et al. Aggregation via empirical risk minimization, 2009.
[42] Ambuj Tewari, et al. On the Generalization Ability of Online Strongly Convex Programming Algorithms, 2008, NIPS.
[43] Jean-Yves Audibert, et al. Progressive mixture rules are deviation suboptimal, 2007, NIPS.
[44] Jean-Yves Audibert. Fast learning rates in statistical inference through aggregation, 2007, arXiv:math/0703854.
[45] Tong Zhang. From ε-entropy to KL-entropy: Analysis of minimum information complexity density estimation, 2006, arXiv:math/0702653.
[46] Tong Zhang, et al. Information-theoretic upper and lower bounds for statistical estimation, 2006, IEEE Transactions on Information Theory.
[47] Gábor Lugosi, et al. Prediction, learning, and games, 2006.
[48] A. Juditsky, et al. Learning by mirror averaging, 2005, arXiv:math/0511468.
[49] P. Bartlett, et al. Local Rademacher complexities, 2005, arXiv:math/0508275.
[50] Sham M. Kakade, et al. Online Bounds for Bayesian Algorithms, 2004, NIPS.
[51] Dietrich Braess, et al. Bernstein polynomials and learning theory, 2004, J. Approx. Theory.
[52] Liam Paninski, et al. Estimation of Entropy and Mutual Information, 2003, Neural Computation.
[53] V. Vovk. Competitive On-line Statistics, 2001.
[54] Claudio Gentile, et al. On the generalization ability of on-line learning algorithms, 2001, IEEE Transactions on Information Theory.
[55] Manfred K. Warmuth, et al. Relative Expected Instantaneous Loss Bounds, 2000, J. Comput. Syst. Sci.
[56] Yuhong Yang. Mixing Strategies for Density Estimation, 2000.
[57] Nicolò Cesa-Bianchi, et al. Worst-Case Bounds for the Logarithmic Loss of Predictors, 1999, Machine Learning.
[58] Yuhong Yang, et al. Information-theoretic determination of minimax rates of convergence, 1999.
[59] Jürgen Forster, et al. On Relative Loss Bounds in Generalized Linear Regression, 1999, FCT.
[60] Manfred K. Warmuth, et al. Relative Loss Bounds for On-Line Density Estimation with the Exponential Family of Distributions, 1999, Machine Learning.
[61] Neri Merhav, et al. Universal Prediction, 1998, IEEE Trans. Inf. Theory.
[62] Andrew R. Barron, et al. Minimax redundancy for the class of memoryless sources, 1997, IEEE Trans. Inf. Theory.
[63] Vladimir Vovk. Aggregating strategies, 1990, COLT '90.
[64] Nick Littlestone. From on-line to batch learning, 1989, COLT '89.
[65] Manfred K. Warmuth, et al. The weighted majority algorithm, 1989, 30th Annual Symposium on Foundations of Computer Science.
[66] Raphail E. Krichevsky, et al. The performance of universal encoding, 1981, IEEE Trans. Inf. Theory.
[67] J. Rissanen. Minimum Description Length Principle, 2010, Encyclopedia of Machine Learning.
[68] Olivier Catoni. Statistical learning theory and stochastic optimization, 2004.
[69] Alexandre B. Tsybakov. Optimal Rates of Aggregation, 2003, COLT.
[70] Arkadi Nemirovski. Topics in Non-Parametric Statistics, 2000.
[71] O. Catoni. The Mixture Approach to Universal Model Selection, 1997.
[72] Jorma Rissanen. Fisher information and stochastic complexity, 1996, IEEE Trans. Inf. Theory.
[73] A. Barron. Are Bayes Rules Consistent in Information?, 1987.
[74] Shun-ichi Amari. A Theory of Pattern Recognition, 1968.
[75] M. Aizerman, et al. Theoretical Foundations of the Potential Function Method in Pattern Recognition Learning, 1964.
[76] Albert B. Novikoff. On Convergence Proofs for Perceptrons, 1963.