Distributed training strategies for a computer vision deep learning algorithm on a distributed GPU cluster

Deep learning algorithms owe their success to high-capacity models with millions of parameters that are tuned in a data-driven fashion. These models are trained by processing millions of examples, so the development of more accurate algorithms is usually limited by the throughput of the computing devices on which they are trained. In this work, we explore how the training of a state-of-the-art neural network for computer vision can be parallelized on a distributed GPU cluster. We address the effect of distributing the training process from two points of view. First, we analyze the scalability of the task and its performance in the distributed setting. Second, we study the impact of distributed training methods on the final accuracy of the models.
