On Symmetry and Initialization for Neural Networks
[1] Daniel Soudry,et al. No bad local minima: Data independent training error guarantees for multilayer neural networks , 2016, ArXiv.
[2] Shai Shalev-Shwartz,et al. SGD Learns Over-parameterized Networks that Provably Generalize on Linearly Separable Data , 2017, ICLR.
[3] Ohad Shamir,et al. The Power of Depth for Feedforward Neural Networks , 2015, COLT.
[4] Shai Ben-David,et al. Understanding Machine Learning: From Theory to Algorithms , 2014 .
[5] Meng Yang,et al. Large-Margin Softmax Loss for Convolutional Neural Networks , 2016, ICML.
[6] J. Håstad. Computational limitations of small-depth circuits , 1987 .
[7] Marvin Minsky,et al. Perceptrons: expanded edition , 1988 .
[8] Hossein Mobahi,et al. Large Margin Deep Networks for Classification , 2018, NeurIPS.
[9] Manfred K. Warmuth,et al. Relating Data Compression and Learnability , 2003 .
[10] Barnabás Póczos,et al. Gradient Descent Provably Optimizes Over-parameterized Neural Networks , 2018, ICLR.
[11] Yuanzhi Li,et al. Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers , 2018, NeurIPS.
[12] Guillermo Sapiro,et al. Robust Large Margin Deep Neural Networks , 2016, IEEE Transactions on Signal Processing.
[13] Yuan Cao,et al. Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks , 2018, ArXiv.
[14] Alexandr Andoni,et al. Learning Polynomials with Neural Networks , 2014, ICML.
[15] Yuanzhi Li,et al. Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data , 2018, NeurIPS.
[16] Arthur Jacot,et al. Neural Tangent Kernel: Convergence and Generalization in Neural Networks , 2018, NeurIPS.
[17] Amit Daniely,et al. SGD Learns the Conjugate Kernel Class of the Network , 2017, NIPS.
[18] Benjamin Recht,et al. Random Features for Large-Scale Kernel Machines , 2007, NIPS.
[19] Albert B. Novikoff. On Convergence Proofs for Perceptrons , 1963 .
[20] Michael Sipser,et al. Parity, circuits, and the polynomial-time hierarchy , 1981, 22nd Annual Symposium on Foundations of Computer Science (sfcs 1981).
[21] Matus Telgarsky,et al. Representation Benefits of Deep Feedforward Networks , 2015, ArXiv.
[22] Samy Bengio,et al. Links between perceptrons, MLPs and SVMs , 2004, ICML.
[23] Yuanzhi Li,et al. A Convergence Theory for Deep Learning via Over-Parameterization , 2018, ICML.
[24] Raman Arora,et al. Understanding Deep Neural Networks with Rectified Linear Units , 2016, Electron. Colloquium Comput. Complex..
[25] F. Rosenblatt. The perceptron: a probabilistic model for information storage and organization in the brain , 1958, Psychological review.
[26] Matus Telgarsky,et al. Spectrally-normalized margin bounds for neural networks , 2017, NIPS.
[27] Marat Z. Arslanov,et al. N-bit Parity Neural Networks with minimum number of threshold neurons , 2016 .
[28] E. Romero,et al. Maximizing the margin with feedforward neural networks , 2002, Proceedings of the 2002 International Joint Conference on Neural Networks (IJCNN 2002).
[29] Bogdan M. Wilamowski,et al. Solving parity-N problems with feedforward neural networks , 2003, Proceedings of the International Joint Conference on Neural Networks (IJCNN 2003).
[30] Pedro M. Domingos,et al. Deep Symmetry Networks , 2014, NIPS.
[31] Tie-Yan Liu,et al. Large Margin Deep Neural Networks: Theory and Algorithms , 2015, ArXiv.
[32] Guillermo Sapiro,et al. Margin Preservation of Deep Neural Networks , 2016, ArXiv.
[33] M. Z. Arslanov,et al. N-bit parity ordered neural networks , 2002, Neurocomputing.
[34] Ruosong Wang,et al. Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks , 2019, ICML.
[35] Miklós Ajtai. Σ¹₁-formulae on finite structures , 1983, Ann. Pure Appl. Log..
[36] Le Song,et al. On the Complexity of Learning Neural Networks , 2017, NIPS.
[37] Kaoru Hirota,et al. A Solution for the N-bit Parity Problem Using a Single Translated Multiplicative Neuron , 2004, Neural Processing Letters.