Breaking the Curse of Dimensionality with Convex Neural Networks
[1] H. Whitney. Analytic Extensions of Differentiable Functions Defined in Closed Sets , 1934 .
[2] Philip Wolfe,et al. An algorithm for quadratic programming , 1956 .
[3] F. Rosenblatt,et al. The perceptron: a probabilistic model for information storage and organization in the brain , 1958, Psychological Review.
[4] G. Forsythe,et al. On the Stationary Values of a Second-Degree Polynomial on the Unit Sphere , 1965 .
[5] R. Schneider. Zu einem Problem von Shephard über die Projektionen konvexer Körper , 1967 .
[6] V. F. Dem'yanov,et al. The Minimization of a Smooth Convex Functional on a Convex Set , 1967 .
[7] E. Bolker. A class of convex bodies , 1969 .
[8] 丸山 徹. Convex Analysisの二,三の進展について , 1977 .
[9] J. Dunn,et al. Conditional gradient algorithms with open loop step size rules , 1978 .
[10] J. Friedman,et al. Projection Pursuit Regression , 1981 .
[12] Geoffrey E. Hinton,et al. Learning representations by back-propagating errors , 1986, Nature.
[14] W. Rudin. Real and complex analysis, 3rd ed. , 1987 .
[15] Herbert Edelsbrunner,et al. Algorithms in Combinatorial Geometry , 1987, EATCS Monographs in Theoretical Computer Science.
[16] J. Lindenstrauss,et al. Approximation of zonoids by zonotopes , 1989 .
[17] R. DeVore,et al. Optimal nonlinear approximation , 1989 .
[18] Ker-Chau Li,et al. Sliced Inverse Regression for Dimension Reduction , 1991 .
[19] L. Evans. Measure theory and fine properties of functions , 1992 .
[20] Andrew R. Barron,et al. Universal approximation bounds for superpositions of a sigmoidal function , 1993, IEEE Trans. Inf. Theory.
[21] Leo Breiman,et al. Hinging hyperplanes for regression, classification, and function approximation , 1993, IEEE Trans. Inf. Theory.
[22] Allan Pinkus,et al. Multilayer Feedforward Networks with a Non-Polynomial Activation Function Can Approximate Any Function , 1991, Neural Networks.
[23] Geoffrey E. Hinton,et al. Bayesian Learning for Neural Networks , 1995 .
[24] Peter L. Bartlett,et al. Efficient agnostic learning of neural networks with bounded fan-in , 1996, IEEE Trans. Inf. Theory.
[25] J. Matousek,et al. Improved upper bounds for approximation by zonotopes , 1996 .
[26] Geoffrey E. Hinton,et al. Generative models for discovering sparse distributed representations , 1997, Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences.
[27] Y. Nesterov. Semidefinite relaxation and nonconvex quadratic optimization , 1998 .
[28] Simon Haykin,et al. Neural Networks: A Comprehensive Foundation , 1998 .
[29] Y. Makovoz. Uniform Approximation by Neural Networks , 1998 .
[30] P. Petrushev. Approximation by ridge functions and neural networks , 1999 .
[31] Allan Pinkus,et al. Approximation theory of the MLP model in neural networks , 1999, Acta Numerica.
[32] Alexander J. Smola,et al. Regularization with Dot-Product Kernels , 2000, NIPS.
[33] Ron Meir,et al. On the near optimality of the stochastic approximation of smooth functions by neural networks , 2000, Adv. Comput. Math..
[34] Peter L. Bartlett,et al. Rademacher and Gaussian Complexities: Risk Bounds and Structural Results , 2003, J. Mach. Learn. Res..
[35] Marcello Sanguineti,et al. Bounds on rates of variable-basis and neural-network approximation , 2001, IEEE Trans. Inf. Theory.
[36] Vladimir Koltchinskii,et al. Rademacher penalties and structural risk minimization , 2001, IEEE Trans. Inf. Theory.
[37] Trevor Hastie,et al. The Elements of Statistical Learning , 2001 .
[38] Martin Burger,et al. Error Bounds for Approximation with Neural Networks , 2001, J. Approx. Theory.
[39] Alexander Barvinok,et al. A course in convexity , 2002, Graduate studies in mathematics.
[40] Chong Gu. Smoothing Spline Anova Models , 2002 .
[41] Adam Krzyzak,et al. A Distribution-Free Theory of Nonparametric Regression , 2002, Springer series in statistics.
[42] Leonidas J. Guibas,et al. Zonotopes as bounding volumes , 2003, SODA '03.
[43] Nello Cristianini,et al. Kernel Methods for Pattern Analysis , 2003, ICTAI.
[44] Ulrike von Luxburg,et al. Distance-Based Classification with Lipschitz Functions , 2004, J. Mach. Learn. Res..
[45] Hrushikesh Narhar Mhaskar,et al. On the tractability of multivariate integration and approximation by neural networks , 2004, J. Complex..
[46] A. Berlinet,et al. Reproducing kernel Hilbert spaces in probability and statistics , 2004 .
[47] Michael I. Jordan,et al. Dimensionality Reduction for Supervised Learning with Reproducing Kernel Hilbert Spaces , 2004 .
[49] Yurii Nesterov,et al. Introductory Lectures on Convex Optimization - A Basic Course , 2014, Applied Optimization.
[50] Ji Zhu,et al. Boosting as a Regularized Path to a Maximum Margin Classifier , 2004, J. Mach. Learn. Res..
[51] Baver Okutmustur. Reproducing kernel Hilbert spaces , 2005 .
[52] Nicolas Le Roux,et al. Convex Neural Networks , 2005, NIPS.
[53] Yurii Nesterov,et al. Smooth minimization of non-smooth functions , 2005, Math. Program..
[54] Michael I. Jordan,et al. Convexity, Classification, and Risk Bounds , 2006 .
[55] Prasad Raghavendra,et al. Hardness of Learning Halfspaces with Noise , 2006, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).
[56] Stephen P. Boyd,et al. Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.
[57] Hrushikesh Narhar Mhaskar,et al. Weighted quadrature formulas and approximation by zonal function networks on the sphere , 2006, J. Complex..
[58] Vitaly Maiorov,et al. Approximation by neural networks and learning theory , 2006, J. Complex..
[59] Alexander A. Sherstov,et al. Cryptographic Hardness for Learning Intersections of Halfspaces , 2006, FOCS.
[60] M. Yuan,et al. Model selection and estimation in regression with grouped variables , 2006 .
[61] Hao Helen Zhang,et al. Component selection and smoothing in multivariate nonparametric regression , 2006, math/0702659.
[62] Ji Zhu,et al. l1 Regularization in Infinite Dimensional Feature Spaces , 2007, COLT.
[63] Robert E. Mahony,et al. Optimization Algorithms on Matrix Manifolds , 2007 .
[65] Léon Bottou,et al. The Tradeoffs of Large Scale Learning , 2007, NIPS.
[66] Benjamin Recht,et al. Random Features for Large-Scale Kernel Machines , 2007, NIPS.
[67] Nicolas Le Roux,et al. Continuous Neural Networks , 2007, AISTATS.
[68] Larry A. Wasserman,et al. SpAM: Sparse Additive Models , 2007, NIPS.
[69] Francis R. Bach,et al. Consistency of the group Lasso and multiple kernel learning , 2007, J. Mach. Learn. Res..
[70] Arnak S. Dalalyan,et al. A New Algorithm for Estimating the Effective Dimension-Reduction Subspace , 2008, J. Mach. Learn. Res..
[71] Ambuj Tewari,et al. On the Complexity of Linear Prediction: Risk Bounds, Margin Bounds, and Regularization , 2008, NIPS.
[72] Francis R. Bach,et al. Exploring Large Feature Spaces with Hierarchical Multiple Kernel Learning , 2008, NIPS.
[73] Lawrence K. Saul,et al. Kernel Methods for Deep Learning , 2009, NIPS.
[74] Geoffrey E. Hinton,et al. Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.
[75] K. Böröczky. About projection bodies , 2011 .
[76] R. Cooke. Real and Complex Analysis , 2011 .
[77] Sara van de Geer,et al. Statistics for High-Dimensional Data: Methods, Theory and Applications , 2011 .
[78] Yaoliang Yu,et al. Accelerated Training for Matrix-norm Regularization: A Boosting Approach , 2012, NIPS.
[79] K. Atkinson,et al. Spherical Harmonics and Approximations on the Unit Sphere: An Introduction , 2012 .
[80] Zaïd Harchaoui,et al. Lifted coordinate descent for learning with trace-norm regularization , 2012, AISTATS.
[81] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[82] Karthik Sridharan,et al. Learning From An Optimization Viewpoint , 2012, ArXiv.
[83] Martin Jaggi,et al. Revisiting Frank-Wolfe: Projection-Free Sparse Convex Optimization , 2013, ICML.
[84] Guanghui Lan. The Complexity of Large-scale Convex Programming under a Linear Optimization Oracle , 2013, 1309.5550.
[85] Francis R. Bach,et al. Convex relaxations of structured matrix factorizations , 2013, ArXiv.
[86] C. Frye,et al. Spherical Harmonics in p Dimensions , 2012, 1205.3548.
[87] Roi Livni,et al. On the Computational Efficiency of Training Neural Networks , 2014, NIPS.
[88] Razvan Pascanu,et al. On the Number of Linear Regions of Deep Neural Networks , 2014, NIPS.
[89] R. Tibshirani,et al. Generalized Additive Models , 1986 .
[90] Shai Ben-David,et al. Understanding Machine Learning: From Theory to Algorithms , 2014 .
[91] Stefan König,et al. Computational aspects of the Hausdorff distance in unbounded dimension , 2014, J. Comput. Geom..
[92] Francis R. Bach,et al. On the Equivalence between Quadrature Rules and Random Features , 2015, ArXiv.
[93] Francis R. Bach,et al. Duality Between Subgradient and Conditional Gradient Methods , 2012, SIAM J. Optim..
[94] Zaïd Harchaoui,et al. Conditional gradient algorithms for norm-regularized smooth convex optimization , 2013, Math. Program..
[95] Francis R. Bach,et al. On the Equivalence between Kernel Quadrature Rules and Random Feature Expansions , 2015, J. Mach. Learn. Res..
[96] L. Rosasco,et al. Reproducing kernel Hilbert spaces , 2019, High-Dimensional Statistics.