Adaptive HTF-MPR

Deep neural networks are widely used in artificial intelligence applications and have demonstrated state-of-the-art accuracy on many tasks. Reaching this accuracy requires the right parameter values, which are obtained through a process known as training. Training over large amounts of data across many iterations comes at a high cost in computation time and energy, so better resource allocation directly reduces training time. TensorFlow, a computational graph library developed by Google, eases the development of neural network models and provides the means to train them. In this article, we propose Adaptive HTF-MPR to carry out this resource allocation, or mapping, on TensorFlow. Adaptive HTF-MPR searches for the best mapping of operations to the available devices using a hybrid approach. We applied the proposed methodology to two well-known image classifiers, VGG-16 and AlexNet, and also performed a full analysis of the solution space of MNIST Softmax. Our results demonstrate that Adaptive HTF-MPR outperforms the default homogeneous TensorFlow mapping. In addition to the speedup, Adaptive HTF-MPR can react to changes in the state of the system and adjust to an improved mapping.
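To make the notion of a "mapping" concrete, the short Python sketch below pins each operation of a toy two-operation TensorFlow graph to a device with tf.device and times every candidate placement exhaustively. This is an illustration of the problem, not the Adaptive HTF-MPR implementation: the device list, the toy graph, and the brute-force loop are assumptions chosen for clarity, and the article's hybrid search exists precisely because real networks have far too many operations to enumerate every mapping this way.

    import itertools
    import time

    import tensorflow as tf

    # Let TensorFlow fall back to an available device if a listed one is
    # absent on this machine (e.g., no GPU installed).
    tf.config.set_soft_device_placement(True)

    # A "mapping" assigns each operation to a device. These device names
    # are assumptions about the host, for illustration only.
    DEVICES = ["/CPU:0", "/GPU:0"]

    def run_with_mapping(mapping):
        """Run a toy two-operation graph, each matmul placed per the mapping."""
        x = tf.random.normal([512, 512])
        with tf.device(mapping[0]):
            h = tf.matmul(x, x)  # first operation, placed on mapping[0]
        with tf.device(mapping[1]):
            y = tf.matmul(h, h)  # second operation, placed on mapping[1]
        return y

    # Exhaustive search over the tiny 2x2 solution space. Adaptive HTF-MPR
    # replaces this brute force with its hybrid search, since enumerating
    # every mapping of a full network such as VGG-16 is infeasible.
    best_mapping, best_time = None, float("inf")
    for mapping in itertools.product(DEVICES, repeat=2):
        start = time.perf_counter()
        run_with_mapping(mapping)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best_mapping, best_time = mapping, elapsed

    print(f"Fastest mapping found: {best_mapping} ({best_time:.4f} s)")

Even in this toy setting the fastest placement depends on the current state of the machine, which is why an adaptive mapper that re-evaluates its choice when system conditions change can keep outperforming a fixed homogeneous placement.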
