Large Scale Distributed Deep Networks
Marc'Aurelio Ranzato | Quoc V. Le | Andrew Y. Ng | Jeffrey Dean | Ke Yang | Gregory S. Corrado | Andrew W. Senior | Kai Chen | Rajat Monga | Mark Z. Mao | Matthieu Devin | Paul A. Tucker
[1] L. Bottou. Stochastic Gradient Learning in Neural Networks, 1991.
[2] Yoshua Bengio, et al. A Neural Probabilistic Language Model, 2003, J. Mach. Learn. Res.
[3] Sanjay Ghemawat, et al. MapReduce: Simplified Data Processing on Large Clusters, 2004, OSDI.
[4] Alexander J. Smola, et al. A scalable modular convex solver for regularized risk minimization, 2007, KDD '07.
[5] Jason Weston, et al. A unified architecture for natural language processing: deep neural networks with multitask learning, 2008, ICML '08.
[6] Sanjay Ghemawat, et al. MapReduce: simplified data processing on large clusters, 2008, CACM.
[7] Li Fei-Fei, et al. ImageNet: A large-scale hierarchical image database, 2009, CVPR.
[8] John Langford, et al. Slow Learners are Fast, 2009, NIPS.
[9] John Langford, et al. Hash Kernels, 2009, AISTATS.
[10] Gideon S. Mann, et al. Efficient Large-Scale Distributed Training of Conditional Maximum Entropy Models, 2009, NIPS.
[11] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.
[12] Rajat Raina, et al. Large-scale deep unsupervised learning using graphics processors, 2009, ICML '09.
[13] Gideon S. Mann, et al. Distributed Training Strategies for the Structured Perceptron, 2010, NAACL.
[14] Yoram Singer, et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization, 2011, J. Mach. Learn. Res.
[15] James Martens, et al. Deep learning via Hessian-free optimization, 2010, ICML.
[16] Alexander J. Smola, et al. Parallelized Stochastic Gradient Descent, 2010, NIPS.
[17] Luca Maria Gambardella, et al. Deep Big Simple Neural Nets Excel on Handwritten Digit Recognition, 2010, ArXiv.
[18] Luca Maria Gambardella, et al. Deep, Big, Simple Neural Nets for Handwritten Digit Recognition, 2010, Neural Computation.
[19] Quoc V. Le, et al. On optimization methods for deep learning, 2011, ICML.
[20] Stephen J. Wright, et al. Hogwild: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent, 2011, NIPS.
[21] Honglak Lee, et al. An Analysis of Single-Layer Networks in Unsupervised Feature Learning, 2011, AISTATS.
[22] Vincent Vanhoucke, et al. Improving the speed of neural networks on CPUs, 2011.
[23] John C. Duchi, et al. Distributed delayed stochastic optimization, 2011, 2012 IEEE 51st IEEE Conference on Decision and Control (CDC).
[24] Jürgen Schmidhuber, et al. Multi-column deep neural networks for image classification, 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[25] Joseph M. Hellerstein, et al. Distributed GraphLab: A Framework for Machine Learning in the Cloud, 2012, Proc. VLDB Endow.
[26] Dong Yu, et al. Scalable stacking and learning for building deep architectures, 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[27] Dong Yu, et al. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition, 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[28] Klaus-Robert Müller, et al. Efficient BackProp, 2012, Neural Networks: Tricks of the Trade.
[29] Tara N. Sainath, et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition, 2012.
[30] Marc'Aurelio Ranzato, et al. Building high-level features using large scale unsupervised learning, 2011, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[31] John Langford, et al. A reliable effective terascale linear learning system, 2011, J. Mach. Learn. Res.