On optimization methods for deep learning
Quoc V. Le | Jiquan Ngiam | Andrew Y. Ng | Adam Coates | Abhik Lahiri | Bobby Prochnow
[1] Paul Smolensky,et al. Information processing in dynamical systems: foundations of harmony theory , 1986 .
[2] Kevin S. Van Horn,et al. Learning as optimization , 1994 .
[3] David J. Field,et al. Emergence of simple-cell receptive field properties by learning a sparse code for natural images , 1996, Nature.
[4] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[5] Patrice Y. Simard,et al. Best practices for convolutional neural networks applied to visual document analysis , 2003, Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings..
[6] Sanjay Ghemawat,et al. MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.
[7] Yoshua Bengio,et al. Greedy Layer-Wise Training of Deep Networks , 2006, NIPS.
[8] Geoffrey E. Hinton,et al. Reducing the Dimensionality of Data with Neural Networks , 2006, Science.
[9] Yee Whye Teh,et al. A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.
[10] Mark W. Schmidt,et al. Accelerated training of conditional random fields with stochastic gradient methods , 2006, ICML.
[11] Rajat Raina,et al. Efficient sparse coding algorithms , 2006, NIPS.
[12] Honglak Lee,et al. Sparse deep belief net model for visual area V2 , 2007, NIPS.
[13] Thomas Hofmann,et al. Map-Reduce for Machine Learning on Multicore , 2007 .
[14] Léon Bottou,et al. The Tradeoffs of Large Scale Learning , 2007, NIPS.
[15] Rajat Raina,et al. Self-taught learning: transfer learning from unlabeled data , 2007, ICML '07.
[16] Alexander J. Smola,et al. A scalable modular convex solver for regularized risk minimization , 2007, KDD '07.
[17] Marc'Aurelio Ranzato,et al. Unsupervised Learning of Invariant Feature Hierarchies with Applications to Object Recognition , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.
[18] Peter L. Bartlett,et al. Adaptive Online Gradient Descent , 2007, NIPS.
[19] Yoshua Bengio,et al. Learning Deep Architectures for AI , 2007, Found. Trends Mach. Learn..
[20] Yoshua Bengio,et al. Extracting and composing robust features with denoising autoencoders , 2008, ICML '08.
[21] Yoshua Bengio,et al. Classification using discriminative restricted Boltzmann machines , 2008, ICML '08.
[22] Honglak Lee,et al. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations , 2009, ICML '09.
[23] Yann LeCun,et al. What is the best multi-stage architecture for object recognition? , 2009, 2009 IEEE 12th International Conference on Computer Vision.
[24] Quoc V. Le,et al. Measuring Invariances in Deep Networks , 2009, NIPS.
[25] Gideon S. Mann,et al. Efficient Large-Scale Distributed Training of Conditional Maximum Entropy Models , 2009, NIPS.
[26] Geoffrey E. Hinton,et al. 3D Object Recognition with Deep Belief Nets , 2009, NIPS.
[27] Quoc V. Le,et al. Proximal regularization for online and batch learning , 2009, ICML '09.
[28] Patrick Gallinari,et al. SGD-QN: Careful Quasi-Newton Stochastic Gradient Descent , 2009, J. Mach. Learn. Res..
[29] Honglak Lee,et al. Unsupervised feature learning for audio classification using convolutional deep belief networks , 2009, NIPS.
[30] Yihong Gong,et al. Linear spatial pyramid matching using sparse coding for image classification , 2009, CVPR.
[31] Rajat Raina,et al. Large-scale deep unsupervised learning using graphics processors , 2009, ICML '09.
[32] Quoc V. Le,et al. Tiled convolutional neural networks , 2010, NIPS.
[33] James Martens,et al. Deep learning via Hessian-free optimization , 2010, ICML.
[34] Yann LeCun,et al. Convolutional Learning of Spatio-temporal Features , 2010, ECCV.
[35] Alexander J. Smola,et al. Parallelized Stochastic Gradient Descent , 2010, NIPS.
[36] Luca Maria Gambardella,et al. Deep Big Simple Neural Nets Excel on Handwritten Digit Recognition , 2010, ArXiv.
[37] Luca Maria Gambardella,et al. Deep, Big, Simple Neural Nets for Handwritten Digit Recognition , 2010, Neural Computation.
[38] Quoc V. Le,et al. Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis , 2011, CVPR 2011.
[39] Honglak Lee,et al. An Analysis of Single-Layer Networks in Unsupervised Feature Learning , 2011, AISTATS.
[40] Yoram Singer,et al. Pegasos: primal estimated sub-gradient solver for SVM , 2011, Math. Program..
[41] Klaus-Robert Müller,et al. Efficient BackProp , 2012, Neural Networks: Tricks of the Trade.
[42] Geoffrey E. Hinton. A Practical Guide to Training Restricted Boltzmann Machines , 2012, Neural Networks: Tricks of the Trade.
[43] Léon Bottou. Stochastic Gradient Learning in Neural Networks , 2022 .