[1] E. L. Lehmann, G. Casella. Theory of Point Estimation, 1998.
[2] Andrew R. Barron, et al. Asymptotic minimax regret for data compression, gambling, and prediction, 1997, IEEE Trans. Inf. Theory.
[3] Eric Moulines, et al. Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n), 2013, NIPS.
[4] T. W. Anderson. An Introduction to Multivariate Statistical Analysis, 2004.
[5] G. D. Murray. Note on estimation of probability density functions, 1977.
[6] A. Nemirovski, et al. Robust Stochastic Approximation Approach to Stochastic Programming, 2009, SIAM J. Optim.
[7] P. Massart, et al. Concentration inequalities and model selection, 2007.
[8] Eric R. Ziegel, et al. Generalized Linear Models, 2002, Technometrics.
[9] Dmitrii Ostrovskii, et al. Finite-sample Analysis of M-estimators using Self-concordance, 2018, arXiv:1810.06838.
[10] Shai Shalev-Shwartz, et al. Online Learning and Online Convex Optimization, 2012, Found. Trends Mach. Learn.
[11] H. Robbins, S. Monro. A Stochastic Approximation Method, 1951, Ann. Math. Statist.
[12] Elad Hazan, et al. Logarithmic regret algorithms for online convex optimization, 2006, Machine Learning.
[13] Ambuj Tewari, et al. On the Complexity of Linear Prediction: Risk Bounds, Margin Bounds, and Regularization, 2008, NIPS.
[14] Gábor Lugosi, et al. Prediction, learning, and games, 2006.
[15] Stephen P. Boyd, et al. Convex Optimization, 2004, Algorithms and Theory of Computation Handbook.
[16] Haipeng Luo, et al. Logistic Regression: The Importance of Being Improper, 2018, COLT.
[17] Jon A. Wellner, et al. Weak Convergence and Empirical Processes: With Applications to Statistics, 1996.
[18] Roman Vershynin, et al. Introduction to the non-asymptotic analysis of random matrices, 2010, Compressed Sensing.
[19] Peter Grünwald, et al. A Tight Excess Risk Bound via a Unified PAC-Bayesian-Rademacher-Shtarkov-MDL Complexity, 2017, ALT.
[20] A. Caponnetto, et al. Optimal Rates for the Regularized Least-Squares Algorithm, 2007, Found. Comput. Math.
[21] Feng Liang, et al. Improved minimax predictive densities under Kullback-Leibler loss, 2006.
[22] Nicolas Macris, et al. Optimal errors and phase transitions in high-dimensional generalized linear models, 2017, Proceedings of the National Academy of Sciences.
[23] Elad Hazan, et al. Logistic Regression: Tight Bounds for Stochastic and Online Optimization, 2014, COLT.
[24] Pierre Gaillard, et al. A Chaining Algorithm for Online Nonparametric Regression, 2015, COLT.
[25] Elad Hazan, et al. Introduction to Online Convex Optimization, 2016, Found. Trends Optim.
[26] Francis R. Bach, et al. Self-concordant analysis for logistic regression, 2009, arXiv.
[27] Larry Wasserman. All of Nonparametric Statistics (Springer Texts in Statistics), 2006.
[28] Wojciech Kotlowski, et al. Maximum Likelihood vs. Sequential Normalized Maximum Likelihood in On-line Density Estimation, 2011, COLT.
[29] J. Rissanen, et al. On Sequentially Normalized Maximum Likelihood Models, 2008.
[30] Vee Ming Ng. On the estimation of parametric density functions, 1980.
[31] P. Massart, et al. Minimum contrast estimators on sieves: exponential bounds and rates of convergence, 1998.
[32] Ambuj Tewari, et al. Smoothness, Low Noise and Fast Rates, 2010, NIPS.
[33] Stergios B. Fotopoulos, et al. All of Nonparametric Statistics, 2007, Technometrics.
[34] Martin Zinkevich. Online Convex Programming and Generalized Infinitesimal Gradient Ascent, 2003, ICML.
[35] D. Freedman. How Many Variables Should Be Entered in a Regression Equation, 1983.
[36] J. Picard, et al. Statistical learning theory and stochastic optimization: École d'été de probabilités de Saint-Flour XXXI - 2001, 2004.
[37] Gábor Lugosi, et al. Concentration Inequalities: A Nonasymptotic Theory of Independence, 2013.
[38] Malay Ghosh, et al. Nonsubjective priors via predictive relative entropy regret, 2006.
[39] V. Spokoiny. Parametric estimation. Finite sample theory, 2011, arXiv:1111.3029.
[40] Sébastien Bubeck, et al. Convex Optimization: Algorithms and Complexity, 2014, Found. Trends Mach. Learn.
[41] Peter L. Bartlett, et al. Rademacher and Gaussian Complexities: Risk Bounds and Structural Results, 2003, J. Mach. Learn. Res.
[42] J. Berkson. Application of the Logistic Function to Bio-Assay, 1944.
[43] R. Bhatia. Positive Definite Matrices, 2007.
[44] Manfred K. Warmuth, et al. The Weighted Majority Algorithm, 1994, Inf. Comput.
[45] Ron Meir, et al. Generalization Error Bounds for Bayesian Mixture Algorithms, 2003, J. Mach. Learn. Res.
[46] Kfir Y. Levy, et al. Fast Rates for Exp-concave Empirical Risk Minimization, 2015, NIPS.
[47] A. Barron, et al. Jeffreys' prior is asymptotically least favorable under entropy risk, 1994.
[48] T. Poggio, et al. Stability Results in Learning Theory, 2005.
[49] Yurii Nesterov, et al. Interior-point polynomial algorithms in convex programming, 1994, SIAM Studies in Applied Mathematics.
[50] Matthew J. Streeter, et al. Open Problem: Better Bounds for Online Logistic Regression, 2012, COLT.
[51] F. Komaki. On asymptotic properties of predictive distributions, 1996.
[52] H. White. Maximum Likelihood Estimation of Misspecified Models, 1982.
[53] André Elisseeff, et al. Stability and Generalization, 2002, J. Mach. Learn. Res.
[54] Alessandro Rudi, et al. Globally Convergent Newton Methods for Ill-conditioned Generalized Self-concordant Losses, 2019, NeurIPS.
[55] Sham M. Kakade, et al. Online Bounds for Bayesian Algorithms, 2004, NIPS.
[56] R. Z. Khasʹminskiĭ, et al. Statistical estimation: asymptotic theory, 1981.
[57] J. Hájek. Local asymptotic minimax and admissibility in estimation, 1972.
[58] László Györfi, et al. A Probabilistic Theory of Pattern Recognition, 1996, Stochastic Modelling and Applied Probability.
[59] Feng Liang, et al. Exact minimax strategies for predictive density estimation, data compression, and model selection, 2002, IEEE Transactions on Information Theory.
[60] Y. Baraud, et al. Rho-estimators revisited: General theory and applications, 2016, The Annals of Statistics.
[61] Ian R. Harris. Predictive fit for natural exponential families, 1989.
[62] L. Le Cam. Asymptotic Methods in Statistical Decision Theory, 1986.
[63] S. R. Jammalamadaka. Empirical Processes in M-Estimation, 2001.
[64] A. Barron. Are Bayes Rules Consistent in Information?, 1987.
[65] Eric R. Ziegel, et al. The Elements of Statistical Learning, 2003, Technometrics.
[66] G. Wahba. Spline models for observational data, 1990.
[67] Francis R. Bach, et al. Adaptivity of averaged stochastic gradient descent to local strong convexity for logistic regression, 2013, J. Mach. Learn. Res.
[68] Jean-Yves Audibert. Fast learning rates in statistical inference through aggregation, 2007, arXiv:math/0703854.
[69] Yuhong Yang. Mixing Strategies for Density Estimation, 2000.
[70] Wojciech Kotlowski, et al. Bounds on Individual Risk for Log-loss Predictors, 2011, COLT.
[71] Jaouad Mourtada. Exact minimax risk for linear least squares, and the lower tail of sample covariance matrices, 2019.
[72] Jean-Yves Audibert, et al. Progressive mixture rules are deviation suboptimal, 2007, NIPS.
[73] M. Talagrand. Upper and Lower Bounds for Stochastic Processes: Modern Methods and Classical Problems, 2014.
[74] Jorma Rissanen. Minimum Description Length Principle, 2010, Encyclopedia of Machine Learning.
[75] E. Candès, et al. The phase transition for the existence of the maximum likelihood estimate in high-dimensional logistic regression, 2018, The Annals of Statistics.
[76] Neri Merhav, et al. Universal Prediction, 1998, IEEE Trans. Inf. Theory.
[77] A. Juditsky, et al. Learning by mirror averaging, 2005, arXiv:math/0511468.
[78] Alessandro Rudi, et al. Beyond Least-Squares: Fast Rates for Regularized Empirical Risk Minimization through Self-Concordance, 2019, COLT.
[79] T. N. Sriram. Asymptotics in Statistics: Some Basic Concepts, 2002.
[80] Shai Shalev-Shwartz, et al. Average Stability is Invariant to Data Preconditioning. Implications to Exp-concave Empirical Risk Minimization, 2016, J. Mach. Learn. Res.
[81] Jorma Rissanen. Fisher information and stochastic complexity, 1996, IEEE Trans. Inf. Theory.
[82] Yuhong Yang, et al. An Asymptotic Property of Model Selection Criteria, 1998, IEEE Trans. Inf. Theory.
[83] Nishant Mehta. Fast rates with high probability in exp-concave statistical learning, 2016, AISTATS.
[84] Trevor Hastie, et al. The Elements of Statistical Learning, 2001.
[85] Rong Jin, et al. Lower and Upper Bounds on the Generalization of Stochastic Exponentially Concave Optimization, 2015, COLT.
[86] Yuhong Yang, et al. Information-theoretic determination of minimax rates of convergence, 1999.
[87] Claudio Gentile, et al. On the generalization ability of on-line learning algorithms, 2001, IEEE Transactions on Information Theory.
[88] Shahar Mendelson. Learning without Concentration, 2014, COLT.
[89] Ohad Shamir, et al. Learnability, Stability and Uniform Convergence, 2010, J. Mach. Learn. Res.
[90] Mihaela Aslan. Asymptotically minimax Bayes predictive densities, 2006, arXiv:0708.0177.
[91] Edward I. George, et al. Admissible predictive density estimation, 2008.
[92] Jayanta K. Ghosh. Higher Order Asymptotics, 1994.
[93] W. Wong, et al. Probability inequalities for likelihood ratios and convergence rates of sieve MLEs, 1995.
[94] Nick Littlestone. From on-line to batch learning, 1989, COLT '89.
[95] Soumendu Sundar Mukherjee. Weak convergence and empirical processes, 2019.
[96] Manfred K. Warmuth, et al. The Last-Step Minimax Algorithm, 2000, ALT.
[97] V. Koltchinskii, et al. Bounding the smallest singular value of a random matrix without concentration, 2013, arXiv:1312.3580.
[98] J. Aitchison. Goodness of prediction fit, 1975.
[99] Alessandro Rudi, et al. Efficient improper learning for online logistic regression, 2020, COLT.
[100] Nicolò Cesa-Bianchi, et al. Worst-Case Bounds for the Logarithmic Loss of Predictors, 1999, Machine Learning.
[101] L. Birgé, et al. A new method for estimation and model selection: ρ-estimation, 2014, arXiv:1403.6057.
[102] T. M. Cover, B. Gopinath (Eds.). Open Problems in Communication and Computation, 1987.
[103] Roberto Imbuzeiro Oliveira. The lower tail of random quadratic forms with applications to ordinary least squares, 2013, arXiv.
[104] J. Hartigan. The maximum likelihood prior, 1998.
[105] P. Massart, et al. Rates of convergence for minimum contrast estimators, 1993.
[106] Thomas M. Cover, et al. Elements of Information Theory, 2005.
[107] Tong Zhang. From ε-entropy to KL-entropy: Analysis of minimum information complexity density estimation, 2006, arXiv:math/0702653.
[108] Peter Grünwald, et al. Fast Rates with Unbounded Losses, 2016, arXiv.
[109] L. Birgé. Rho-estimators for shape restricted density estimation, 2016.
[110] Nathan Srebro, et al. Fast Rates for Regularized Objectives, 2008, NIPS.
[111] S. Mendelson, et al. Performance of empirical risk minimization in linear aggregation, 2014, arXiv:1402.5763.
[112] Peter L. Bartlett, et al. Horizon-Independent Optimal Prediction with Log-Loss in Exponential Families, 2013, COLT.
[113] V. Koltchinskii. Oracle inequalities in empirical risk minimization and sparse recovery problems, 2011.
[114] Peter Grünwald, et al. Fast Rates for General Unbounded Loss Functions: From ERM to Generalized Bayes, 2016, J. Mach. Learn. Res.
[115] O. Catoni. The Mixture Approach to Universal Model Selection, 1997.
[116] Luela Prifti. On Parametric Estimation, 2015.
[117] Vladimir Vovk. A game of prediction with expert advice, 1995, COLT '95.
[118] Vladimir Vapnik. Statistical learning theory, 1998.
[119] Babak Hassibi, et al. The Impact of Regularization on High-dimensional Logistic Regression, 2019, NeurIPS.
[120] E. Candès, et al. A modern maximum-likelihood theory for high-dimensional logistic regression, 2018, Proceedings of the National Academy of Sciences.
[121] Abraham Wald. Statistical Decision Functions, 1951.