Efficient Learning of CNNs using Patch Based Features
Alon Brutzkus | Amir Globerson | Eran Malach | Alon Regev Netser | Shai Shalev-Shwartz
[1] J. Zico Kolter,et al. Patches Are All You Need? , 2022, Trans. Mach. Learn. Res..
[2] Dar Gilboa,et al. Deep Networks Provably Classify Data on Curves , 2021, NeurIPS.
[3] Aravindan Vijayaraghavan,et al. Efficient Algorithms for Learning Depth-2 Neural Networks with General ReLU Activations , 2021, NeurIPS.
[4] Joan Bruna,et al. On the Cryptographic Hardness of Learning Single Periodic Neurons , 2021, NeurIPS.
[5] Jeff Z. HaoChen,et al. Provable Guarantees for Self-Supervised Deep Learning with Spectral Contrastive Loss , 2021, NeurIPS.
[6] Alexander Cloninger,et al. A deep network construction that adapts to intrinsic dimensionality beyond the domain , 2021, Neural Networks.
[7] A. Dosovitskiy,et al. MLP-Mixer: An all-MLP Architecture for Vision , 2021, NeurIPS.
[8] Eran Malach,et al. Quantifying the Benefit of Using Differentiable Learning over Tangent Kernels , 2021, ICML.
[9] Amit Daniely,et al. From Local Pseudorandom Generators to Hardness of Learning , 2021, COLT.
[10] Edouard Oyallon,et al. The Unreasonable Effectiveness of Patches in Deep Convolutional Kernels Methods , 2021, ICLR.
[11] S. Gelly,et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale , 2020, ICLR.
[12] John Wright,et al. Deep Networks and the Multiple Manifold Problem , 2020, ICLR.
[13] Andrea Montanari,et al. When do neural networks outperform kernel methods? , 2020, NeurIPS.
[14] Amit Daniely,et al. Hardness of Learning Neural Networks with Natural Weights , 2020, NeurIPS.
[15] Daniel M. Kane,et al. Algorithms and SQ Lower Bounds for PAC Learning One-Hidden-Layer ReLU Networks , 2020, COLT.
[16] Ingo Steinwart,et al. Adaptive learning rates for support vector machines working on data with low intrinsic dimension , 2020, The Annals of Statistics.
[17] Nathan Srebro,et al. Approximate is Good Enough: Probabilistic Variants of Dimensional and Margin Complexity , 2020, COLT.
[18] Amit Daniely,et al. Learning Parities with Neural Networks , 2020, NeurIPS.
[19] Matus Telgarsky,et al. Polylogarithmic width suffices for gradient descent to achieve arbitrarily small test error with shallow ReLU networks , 2019, ICLR.
[20] F. Krzakala,et al. Modeling the Influence of Data Structure on Learning in Neural Networks: The Hidden Manifold Model , 2019, Physical Review X.
[21] T. Zhao,et al. Nonparametric Regression on Low-Dimensional Manifolds using Deep ReLU Networks , 2019, ArXiv.
[22] Johannes Schmidt-Hieber,et al. Deep ReLU network approximation of functions on a manifold , 2019, ArXiv.
[23] Yuan Cao,et al. Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks , 2019, NeurIPS.
[24] Ruosong Wang,et al. Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks , 2019, ICML.
[25] Liwei Wang,et al. Gradient Descent Finds Global Minima of Deep Neural Networks , 2018, ICML.
[26] Matus Telgarsky,et al. Gradient descent aligns the layers of deep linear networks , 2018, ICLR.
[27] Barnabás Póczos,et al. Gradient Descent Provably Optimizes Over-parameterized Neural Networks , 2018, ICLR.
[28] Zhize Li,et al. Learning Two-layer Neural Networks with Symmetric Inputs , 2018, ICLR.
[29] Arthur Jacot,et al. Neural Tangent Kernel: Convergence and Generalization in Neural Networks , 2018, NeurIPS.
[30] Simon S. Du,et al. Improved Learning of One-hidden-layer Convolutional Neural Networks with Overlaps , 2018, ArXiv.
[31] Samet Oymak,et al. End-to-end Learning of a Convolutional Neural Network via Deep Tensor Decomposition , 2018, ArXiv.
[32] Shai Shalev-Shwartz,et al. A Provably Correct Algorithm for Deep Learning that Actually Works , 2018, ArXiv.
[33] Sanjeev Arora,et al. On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization , 2018, ICML.
[34] Yuandong Tian,et al. Gradient Descent Learns One-hidden-layer CNN: Don't be Afraid of Spurious Local Minima , 2017, ICML.
[35] Nathan Srebro,et al. The Implicit Bias of Gradient Descent on Separable Data , 2017, J. Mach. Learn. Res..
[36] Amir Globerson,et al. Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs , 2017, ICML.
[37] Tengyu Ma,et al. Identity Matters in Deep Learning , 2016, ICLR.
[38] Kenji Kawaguchi,et al. Deep Learning without Poor Local Minima , 2016, NIPS.
[39] Ronen Basri,et al. Efficient Representation of Low-Dimensional Manifolds using Deep Networks , 2016, ICLR.
[40] Roi Livni,et al. On the Computational Efficiency of Training Neural Networks , 2014, NIPS.
[41] Shai Shalev-Shwartz,et al. K-means recovers ICA filters when independent components are sparse , 2014, ICML.
[42] Shai Ben-David,et al. Understanding Machine Learning: From Theory to Algorithms , 2014 .
[43] Amit Daniely,et al. Complexity Theoretic Limitations on Learning DNF's , 2014, COLT.
[44] Razvan Pascanu,et al. On the Number of Linear Regions of Deep Neural Networks , 2014, NIPS.
[45] Razvan Pascanu,et al. On the number of response regions of deep feed forward networks with piece-wise linear activations , 2013, ArXiv.
[46] Honglak Lee,et al. An Analysis of Single-Layer Networks in Unsupervised Feature Learning , 2011, AISTATS.
[47] David P. Williamson,et al. The Design of Approximation Algorithms , 2011 .
[48] Benjamin Recht,et al. Random Features for Large-Scale Kernel Machines , 2007, NIPS.
[49] Alexander A. Sherstov,et al. Cryptographic Hardness for Learning Intersections of Halfspaces , 2006, FOCS.
[50] Ding-Xuan Zhou,et al. The covering number in learning theory , 2002, J. Complex..
[51] Sanjeev R. Kulkarni,et al. Covering numbers for real-valued function classes , 1997, IEEE Trans. Inf. Theory.
[52] David Haussler,et al. Sphere Packing Numbers for Subsets of the Boolean n-Cube with Bounded Vapnik-Chervonenkis Dimension , 1995, J. Comb. Theory, Ser. A.
[53] Kenneth Falconer,et al. Fractal Geometry: Mathematical Foundations and Applications , 1990 .
[54] Alon Brutzkus,et al. An optimization and generalization analysis for max-pooling networks , 2021, UAI.
[55] Ali Rahimi,et al. Weighted Sums of Random Kitchen Sinks: Replacing minimization with randomization in learning , 2008, NIPS.
[56] Alexander A. Sherstov,et al. Cryptographic Hardness Results for Learning Intersections of Halfspaces , 2006, Electron. Colloquium Comput. Complex..
[57] Nello Cristianini,et al. Kernel Methods for Pattern Analysis , 2004 .
[58] Teofilo F. GONZALEZ,et al. Clustering to Minimize the Maximum Intercluster Distance , 1985, Theor. Comput. Sci..