[1] Stephen J. Wright, et al. Hogwild: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent, 2011, NIPS.
[2] Charles R. Qi, et al. Exploring Hidden Dimensions in Parallelizing Convolutional Neural Networks, 2018, ICML.
[3] Russell Reed, et al. Pruning algorithms - a survey, 1993, IEEE Trans. Neural Networks.
[4] Ali Farhadi, et al. YOLO9000: Better, Faster, Stronger, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[5] Khaled Shaalan, et al. Speech Recognition Using Deep Neural Networks: A Systematic Review, 2019, IEEE Access.
[6] Nikhil R. Devanur, et al. PipeDream: Generalized Pipeline Parallelism for DNN Training, 2019, SOSP.
[7] Minsik Cho, et al. Data-Parallel Distributed Training of Very Large Models Beyond GPU Capacity, 2018, ArXiv.
[8] Xin Jia. Image Recognition Method Based on Deep Learning, 2017, 2017 29th Chinese Control And Decision Conference (CCDC).
[9] Mattan Erez, et al. PruneTrain: Fast Neural Network Training by Dynamic Sparse Model Reconfiguration, 2019, SC.
[10] Kilian Q. Weinberger, et al. Multi-Scale Dense Networks for Resource Efficient Image Classification, 2017, ICLR.
[11] W. Marsden. I and J, 2012.
[12] Tara N. Sainath, et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups, 2012, IEEE Signal Processing Magazine.
[13] Razvan Pascanu, et al. Sobolev Training for Neural Networks, 2017, NIPS.
[14] Zheng Xu, et al. Training Neural Networks Without Gradients: A Scalable ADMM Approach, 2016, ICML.
[15] Bin Gu, et al. Training Neural Networks Using Features Replay, 2018, NeurIPS.
[16] Santanu Chaudhury, et al. A Data and Model-Parallel, Distributed and Scalable Framework for Training of Deep Networks in Apache Spark, 2017, ArXiv.
[17] Kurt Keutzer, et al. Integrated Model, Batch, and Domain Parallelism in Training Neural Networks, 2017, SPAA.
[18] Mehryar Mohri, et al. AdaNet: Adaptive Structural Learning of Artificial Neural Networks, 2016, ICML.
[19] Beatrice Santorini, et al. Building a Large Annotated Corpus of English: The Penn Treebank, 1993, CL.
[20] Gregory Shakhnarovich, et al. FractalNet: Ultra-Deep Neural Networks without Residuals, 2016, ICLR.
[21] Ryota Tomioka, et al. AMPNet: Asynchronous Model-Parallel Training for Dynamic Neural Networks, 2017, ArXiv.
[22] Miguel Á. Carreira-Perpiñán, et al. Distributed Optimization of Deeply Nested Systems, 2012, AISTATS.
[23] R. Levy. An Integrated Model, 2016.
[24] Torsten Hoefler, et al. Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis, 2018.
[25] Quoc V. Le, et al. Neural Architecture Search with Reinforcement Learning, 2016, ICLR.
[26] Hojung Lee, et al. Local Critic Training of Deep Neural Networks, 2019, 2019 International Joint Conference on Neural Networks (IJCNN).
[27] Dit-Yan Yeung, et al. Constructive Algorithms for Structure Learning in Feedforward Neural Networks for Regression Problems, 1997.
[28] Puneet Gupta, et al. Optimizing Multi-GPU Parallelization Strategies for Deep Learning Training, 2019, IEEE Micro.
[29] Jian Sun, et al. Identity Mappings in Deep Residual Networks, 2016, ECCV.
[30] Junmo Kim, et al. A Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[31] Alex Graves, et al. Decoupled Neural Interfaces using Synthetic Gradients, 2016, ICML.
[32] Jorge Nocedal, et al. Optimization Methods for Large-Scale Machine Learning, 2016, SIAM Rev.
[33] Hao Wu, et al. Mixed Precision Training, 2017, ICLR.
[34] Trevor Darrell, et al. Learning the Structure of Deep Convolutional Networks, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[35] James Cheng, et al. TensorOpt: Exploring the Tradeoffs in Distributed DNN Training with Auto-Parallelism, 2020, IEEE Transactions on Parallel and Distributed Systems.
[36] Alexander J. Smola, et al. Parallelized Stochastic Gradient Descent, 2010, NIPS.
[37] Ling Shao, et al. Dynamical Hyperparameter Optimization via Deep Reinforcement Learning in Tracking, 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[38] Shashi Pal Singh, et al. Machine Translation Using Deep Learning: An Overview, 2017, 2017 International Conference on Computer, Communications and Electronics (Comptelix).
[39] Jascha Sohl-Dickstein, et al. Measuring the Effects of Data Parallelism on Neural Network Training, 2018, J. Mach. Learn. Res.
[40] Ling Shao, et al. Hyperparameter Optimization for Tracking with Continuous Deep Q-Learning, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[41] Marc'Aurelio Ranzato, et al. Large Scale Distributed Deep Networks, 2012, NIPS.
[42] Dumitru Erhan, et al. Going Deeper with Convolutions, 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[43] Alexander Aiken, et al. Beyond Data and Model Parallelism for Deep Neural Networks, 2018, SysML.
[44] Jiajun Zhang, et al. Deep Neural Networks in Machine Translation: An Overview, 2015, IEEE Intelligent Systems.
[45] H. T. Kung, et al. BranchyNet: Fast Inference via Early Exiting from Deep Neural Networks, 2016, 2016 23rd International Conference on Pattern Recognition (ICPR).
[46] Guigang Zhang, et al. Deep Learning, 2016, Int. J. Semantic Comput.
[47] AAAS News, et al. Book Reviews, 1893, Buffalo Medical and Surgical Journal.
[48] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[49] James T. Kwok, et al. Constructive Algorithms for Structure Learning in Feedforward Neural Networks for Regression Problems, 1997, IEEE Trans. Neural Networks.
[50] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.
[51] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[52] Dustin Tran, et al. Mesh-TensorFlow: Deep Learning for Supercomputers, 2018, NeurIPS.