Parallel Deep Neural Network Training for Big Data on Blue Gene/Q
Tara N. Sainath | Brian Kingsbury | Bhuvana Ramabhadran | Michael Picheny | John A. Gunnels | I-Hsin Chung | Upendra V. Chaudhari | Vernon Austel
[1] Barak A. Pearlmutter. Fast Exact Multiplication by the Hessian, 1994, Neural Computation.
[2] Nicol N. Schraudolph, et al. Fast Curvature Matrix-Vector Products for Second-Order Gradient Descent, 2002, Neural Computation.
[3] Michael Lang, et al. A Performance and Scalability Analysis of the BlueGene/L Architecture, 2004, Proceedings of the ACM/IEEE SC2004 Conference.
[4] Kunle Olukotun, et al. Map-Reduce for Machine Learning on Multicore, 2006, NIPS.
[5] Yee Whye Teh, et al. A Fast Learning Algorithm for Deep Belief Nets, 2006, Neural Computation.
[6] Sally A. McKee, et al. Predicting parallel application performance via machine learning approaches, 2007, Concurr. Comput. Pract. Exp.
[7] Brian Kingsbury, et al. Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling, 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[8] James Martens, et al. Deep learning via Hessian-free optimization, 2010, ICML.
[9] Alexander J. Smola, et al. Parallelized Stochastic Gradient Descent, 2010, NIPS.
[10] Yoshua Bengio, et al. Understanding the difficulty of training deep feedforward neural networks, 2010, AISTATS.
[11] Dong Yu, et al. Roles of Pre-Training and Fine-Tuning in Context-Dependent DBN-HMMs for Real-World Speech Recognition, 2010.
[12] Brian Kingsbury, et al. The IBM Attila speech recognition toolkit, 2010, 2010 IEEE Spoken Language Technology Workshop.
[13] Quoc V. Le, et al. On optimization methods for deep learning, 2011, ICML.
[14] Jorge Nocedal, et al. On the Use of Stochastic Hessian Information in Optimization Methods for Machine Learning, 2011, SIAM J. Optim.
[15] Dong Yu, et al. Conversational Speech Transcription Using Context-Dependent Deep Neural Networks, 2012, ICML.
[16] Tara N. Sainath, et al. Making Deep Belief Networks effective for large vocabulary continuous speech recognition, 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.
[17] Viatcheslav Gurev, et al. Toward real-time modeling of human heart ventricles at cellular resolution: Simulation of drug-induced arrhythmias, 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[18] Marc'Aurelio Ranzato, et al. Large Scale Distributed Deep Networks, 2012, NIPS.
[19] Tara N. Sainath, et al. Fundamental Technologies in Modern Speech Recognition, 2012, DOI 10.1109/MSP.2012.2205597.
[20] Michael Gschwind, et al. Blue Gene/Q: design for sustained multi-petaflop computing, 2012, ICS '12.
[21] Navdeep Jaitly, et al. Application of Pretrained Deep Neural Networks to Large Vocabulary Speech Recognition, 2012, INTERSPEECH.
[22] Daniel Povey, et al. Krylov Subspace Descent for Deep Learning, 2011, AISTATS.
[23] Tara N. Sainath, et al. Scalable Minimum Bayes Risk Training of Deep Neural Network Acoustic Models Using Distributed Hessian-free Optimization, 2012, INTERSPEECH.
[24] Michael Gschwind, et al. The IBM Blue Gene/Q Compute Chip, 2012, IEEE Micro.
[25] Geoffrey E. Hinton. A Practical Guide to Training Restricted Boltzmann Machines, 2012, Neural Networks: Tricks of the Trade.
[26] Tara N. Sainath, et al. Accelerating Hessian-free optimization for Deep Neural Networks by implicit preconditioning and sampling, 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[27] Viatcheslav Gurev, et al. Science at LLNL with IBM Blue Gene/Q, 2013, IBM J. Res. Dev.
[28] Ebru Arisoy, et al. Low-rank matrix factorization for Deep Neural Network training with high-dimensional output targets, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[29] George Saon, et al. A comparison of two optimization techniques for sequence discriminative training of deep neural networks, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).