GPU memory leveraged for accelerated training using TensorFlow

Machine learning has been used as a detection technique by many security vendors for some time. With the acceleration provided by GPUs, many security products can now apply various deep learning methods and neural network architectures to malware classification. However, these methods, powerful as they are, are limited by the amount of memory available on a GPU and by the constant need to transfer data from the CPU to the GPU. Because training models for the security industry requires very large databases, consisting of millions of malicious and benign samples, security vendors have had to find ways to overcome these memory constraints. This paper addresses the problem and presents approaches for applying deep learning algorithms to large databases, adapted to well-known machine learning frameworks such as Theano and TensorFlow. The results show that training time can be reduced by a factor of 30 when GPU memory is used efficiently.
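As an illustration of how GPU memory pressure can be kept bounded, the sketch below streams training data through a tf.data input pipeline with prefetching instead of materializing the whole sample database on the device. This is a minimal example under assumed conventions, not the paper's implementation: the feature dimension, batch size, and the sample_generator placeholder are all hypothetical, and the API shown is the modern TensorFlow 2 interface.

# Minimal sketch (assumptions, not the paper's implementation): stream training
# data to the GPU batch by batch with tf.data, so the full sample database never
# has to reside in GPU memory. The generator, feature size, and batch size are
# hypothetical placeholders.
import numpy as np
import tensorflow as tf

NUM_FEATURES = 1024     # assumed size of a per-sample feature vector
BATCH_SIZE = 4096
NUM_SAMPLES = 100_000   # stand-in for a much larger on-disk collection

def sample_generator():
    """Yield (features, label) pairs one at a time, e.g. from disk-backed storage."""
    for _ in range(NUM_SAMPLES):
        features = np.random.rand(NUM_FEATURES).astype(np.float32)  # placeholder I/O
        label = np.int32(0)                                         # 0 = clean, 1 = malicious
        yield features, label

dataset = (
    tf.data.Dataset.from_generator(
        sample_generator,
        output_signature=(
            tf.TensorSpec(shape=(NUM_FEATURES,), dtype=tf.float32),
            tf.TensorSpec(shape=(), dtype=tf.int32),
        ),
    )
    .batch(BATCH_SIZE)
    .prefetch(tf.data.AUTOTUNE)  # overlap host-side loading/transfer with GPU compute
)

# Small binary classifier; only one batch at a time occupies GPU memory during training.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit(dataset, epochs=1)

Prefetching keeps the next batch staged while the current one is being processed, which is one common way to hide the CPU-to-GPU transfer cost the abstract describes; the paper's own techniques for Theano and TensorFlow may differ.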
