Risk of penalized least squares, greedy selection and l1-penalization for flexible function libraries
[1] H. Akaike. Fitting autoregressive models for prediction , 1969 .
[2] Toby Berger,et al. Rate distortion theory : a mathematical basis for data compression , 1971 .
[3] H. Akaike,et al. Information Theory and an Extension of the Maximum Likelihood Principle , 1973 .
[4] R. Tapia,et al. Nonparametric Maximum Likelihood Estimation of Probability Densities by Penalty Function Methods , 1975 .
[5] G. Schwarz. Estimating the Dimension of a Model , 1978 .
[6] J. Rissanen,et al. Modeling by Shortest Data Description , 1978, Autom..
[7] I. Good,et al. Density Estimation and Bump-Hunting by the Penalized Likelihood Method Exemplified by Scattering and Meteorite Data , 1980 .
[8] R. Shibata. An optimal selection of regression variables , 1981 .
[9] J. Friedman,et al. Projection Pursuit Regression , 1981 .
[10] B. Silverman,et al. On the Estimation of a Probability Density Function by the Maximum Penalized Likelihood Method , 1982 .
[11] J. Rissanen. A Universal Prior for Integers and Estimation by Minimum Description Length , 1983 .
[12] Geoffrey E. Hinton,et al. Learning internal representations by error propagation , 1986 .
[13] Robin Sibson,et al. What is projection pursuit , 1987 .
[14] Ker-Chau Li,et al. Asymptotic Optimality for $C_p, C_L$, Cross-Validation and Generalized Cross-Validation: Discrete Index Set , 1987 .
[15] George Cybenko,et al. Approximation by superpositions of a sigmoidal function , 1989, Math. Control. Signals Syst..
[16] G. Wahba. Spline models for observational data , 1990 .
[17] D. Cox,et al. Asymptotic Analysis of Penalized Likelihood and Related Estimators , 1990 .
[18] A. Barron,et al. Discussion: Multivariate Adaptive Regression Splines , 1991 .
[19] Thomas M. Cover,et al. Elements of Information Theory , 2005 .
[20] A. Barron. Approximation and Estimation Bounds for Artificial Neural Networks , 1991, COLT '91.
[21] J. H. Friedman. Multivariate Adaptive Regression Splines , 1991 .
[22] Andrew R. Barron,et al. Minimum complexity density estimation , 1991, IEEE Trans. Inf. Theory.
[23] L. Jones. A Simple Lemma on Greedy Approximation in Hilbert Space and Convergence Rates for Projection Pursuit Regression and Neural Network Training , 1992 .
[24] Andrew R. Barron,et al. Universal approximation bounds for superpositions of a sigmoidal function , 1993, IEEE Trans. Inf. Theory.
[25] Stéphane Mallat,et al. Matching pursuits with time-frequency dictionaries , 1993, IEEE Trans. Signal Process..
[26] P. Massart,et al. Rates of convergence for minimum contrast estimators , 1993 .
[27] D. Donoho,et al. Basis pursuit , 1994, Proceedings of 1994 28th Asilomar Conference on Signals, Systems and Computers.
[28] David Haussler,et al. Sphere Packing Numbers for Subsets of the Boolean n-Cube with Bounded Vapnik-Chervonenkis Dimension , 1995, J. Comb. Theory, Ser. A.
[29] C. Mallows. More comments on Cp , 1995 .
[30] Y. Makovoz. Random Approximants and Neural Networks , 1996 .
[31] Dennis D. Cox,et al. Penalized Likelihood-type Estimators for Generalized Nonparametric Regression , 1996 .
[32] Peter L. Bartlett,et al. Efficient agnostic learning of neural networks with bounded fan-in , 1996, IEEE Trans. Inf. Theory.
[33] Ronald A. DeVore,et al. Some remarks on greedy algorithms , 1996, Adv. Comput. Math..
[34] R. Tibshirani. Regression Shrinkage and Selection via the Lasso , 1996 .
[35] Manfred K. Warmuth,et al. Exponentiated Gradient Versus Gradient Descent for Linear Predictors , 1997, Inf. Comput..
[36] Tj Sweeting,et al. Invited discussion of A. R. Barron: Information-theoretic characterization of Bayes performance and the choice of priors in parametric and nonparametric problems , 1998 .
[37] Jorma Rissanen,et al. The Minimum Description Length Principle in Coding and Modeling , 1998, IEEE Trans. Inf. Theory.
[38] Xiaotong Shen. On the Method of Penalization , 1998 .
[39] Michael A. Saunders,et al. Atomic Decomposition by Basis Pursuit , 1998, SIAM J. Sci. Comput..
[40] Yuhong Yang,et al. An Asymptotic Property of Model Selection Criteria , 1998, IEEE Trans. Inf. Theory.
[41] P. Massart,et al. Minimum contrast estimators on sieves: exponential bounds and rates of convergence , 1998 .
[42] P. Massart,et al. Risk bounds for model selection via penalization , 1999 .
[43] Yuhong Yang,et al. Information-theoretic determination of minimax rates of convergence , 1999 .
[44] Andrew R. Barron,et al. Estimation with two hidden layer neural nets , 1999, IJCNN'99. International Joint Conference on Neural Networks. Proceedings (Cat. No.99CH36339).
[45] E. Candès,et al. Ridgelets: a key to higher-dimensional intermittency? , 1999, Philosophical Transactions of the Royal Society of London. Series A: Mathematical, Physical and Engineering Sciences.
[46] Arkadi Nemirovski,et al. Topics in Non-Parametric Statistics , 2000 .
[47] Y. Baraud. Model selection for regression on a fixed design , 2000 .
[48] A. Juditsky,et al. Functional aggregation for nonparametric regression , 2000 .
[49] Model Selection In Non-Parametric Regression , 2000 .
[50] M. R. Osborne,et al. On the LASSO and its Dual , 2000 .
[51] Vladimir N. Vapnik,et al. The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.
[52] Colin L. Mallows,et al. Some Comments on Cp , 2000, Technometrics.
[53] Yuhong Yang. Combining Different Procedures for Adaptive Regression , 2000, Journal of Multivariate Analysis.
[54] Andrew R. Barron,et al. Penalized least squares, model selection, convex hull classes and neural nets , 2001, ESANN.
[55] D. Donoho,et al. Atomic Decomposition by Basis Pursuit , 2001 .
[56] J. Friedman. Greedy function approximation: A gradient boosting machine , 2001 .
[57] Felipe Cucker,et al. On the mathematical foundations of learning , 2001 .
[58] P. Massart,et al. Gaussian model selection , 2001 .
[59] V. Temlyakov,et al. Two Lower Estimates in Greedy Approximation , 2001 .
[60] Y. Baraud. Model selection for regression on a random design , 2002 .
[61] J. Friedman. Stochastic gradient boosting , 2002 .
[62] Jerome H. Friedman,et al. Tutorial: Getting Started with MART in R , 2002 .
[63] Adam Krzyzak,et al. A Distribution-Free Theory of Nonparametric Regression , 2002, Springer series in statistics.
[64] Alexandre B. Tsybakov,et al. Optimal Rates of Aggregation , 2003, COLT.
[65] Olivier Catoni,et al. Statistical learning theory and stochastic optimization , 2004 .
[66] R. Tibshirani,et al. Least angle regression , 2004, math/0406456.
[67] J. Picard,et al. Statistical learning theory and stochastic optimization : École d'Été de Probabilités de Saint-Flour XXXI - 2001 , 2004 .
[68] Mark J van der Laan,et al. Deletion/Substitution/Addition Algorithm in Learning with Applications in Genomics , 2004, Statistical applications in genetics and molecular biology.
[69] Andrew R. Barron,et al. Approximation and estimation bounds for artificial neural networks , 2004, Machine Learning.
[70] Yuhong Yang. Aggregating regression procedures to improve performance , 2004 .
[71] E. Candès,et al. New tight frames of curvelets and optimal representations of objects with piecewise $C^2$ singularities , 2004 .
[72] E. Livshits,et al. Rate of Convergence of Pure Greedy Algorithms , 2004 .
[73] Bin Yu,et al. Boosting with early stopping: Convergence and consistency , 2005, math/0508276.
[74] V. Koltchinskii,et al. Complexities of convex combinations and bounding the generalization error in classification , 2004, math/0405356.
[75] Stephen P. Boyd,et al. Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.
[76] Peng Zhao,et al. On Model Selection Consistency of Lasso , 2006, J. Mach. Learn. Res..
[77] Florentina Bunea,et al. Sparse Density Estimation with l1 Penalties , 2007, COLT.
[78] A. Tsybakov,et al. Aggregation for Gaussian regression , 2007, 0710.3654.
[79] R. Tibshirani,et al. Pathwise Coordinate Optimization , 2007, 0708.1485.
[80] A. Barron,et al. Adaptive Annealing , 2008 .
[81] A. Barron,et al. Approximation and learning by greedy algorithms , 2008, 0803.1718.
[82] Tong Zhang. Some sharp performance bounds for least squares regression with L1 regularization , 2009, 0908.2869.
[83] B. W. Silverman. On the Estimation of a Probability Density Function by the Maximum Penalized Likelihood Method , .
[84] R. A. Gaskins,et al. Nonparametric roughness penalties for probability densities , 1971 .