Adaptive HTF-MPR

Deep neural networks are widely used in artificial intelligence applications and have demonstrated state-of-the-art accuracy on many tasks. Reaching this accuracy requires the right parameter values, which are obtained through a process known as training. Training over large amounts of data across many iterations comes at a high cost in computation time and energy, so better resource allocation directly reduces training time. TensorFlow, a computational graph library developed by Google, eases the development of neural network models and provides the means to train them. In this article, we propose Adaptive HTF-MPR to carry out this resource allocation, or mapping, on TensorFlow. Adaptive HTF-MPR searches for the best mapping of operations to the available devices using a hybrid approach. We applied the proposed methodology to two well-known image classifiers, VGG-16 and AlexNet, and also performed a full analysis of the solution space of MNIST Softmax. Our results demonstrate that Adaptive HTF-MPR outperforms the default homogeneous TensorFlow mapping. In addition to the speedup, Adaptive HTF-MPR can react to changes in the state of the system and adjust to an improved mapping.
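To make the notion of a "mapping" concrete, the short Python sketch below pins each operation of a toy two-operation TensorFlow graph to a device with tf.device and times every candidate placement exhaustively. This is an illustration of the problem, not the Adaptive HTF-MPR implementation: the device list, the toy graph, and the brute-force loop are assumptions chosen for clarity, and the article's hybrid search exists precisely because real networks have far too many operations to enumerate every mapping this way.

    import itertools
    import time

    import tensorflow as tf

    # Let TensorFlow fall back to an available device if a listed one is
    # absent on this machine (e.g., no GPU installed).
    tf.config.set_soft_device_placement(True)

    # A "mapping" assigns each operation to a device. These device names
    # are assumptions about the host, for illustration only.
    DEVICES = ["/CPU:0", "/GPU:0"]

    def run_with_mapping(mapping):
        """Run a toy two-operation graph, each matmul placed per the mapping."""
        x = tf.random.normal([512, 512])
        with tf.device(mapping[0]):
            h = tf.matmul(x, x)  # first operation, placed on mapping[0]
        with tf.device(mapping[1]):
            y = tf.matmul(h, h)  # second operation, placed on mapping[1]
        return y

    # Exhaustive search over the tiny 2x2 solution space. Adaptive HTF-MPR
    # replaces this brute force with its hybrid search, since enumerating
    # every mapping of a full network such as VGG-16 is infeasible.
    best_mapping, best_time = None, float("inf")
    for mapping in itertools.product(DEVICES, repeat=2):
        start = time.perf_counter()
        run_with_mapping(mapping)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best_mapping, best_time = mapping, elapsed

    print(f"Fastest mapping found: {best_mapping} ({best_time:.4f} s)")

Even in this toy setting the fastest placement depends on the current state of the machine, which is why an adaptive mapper that re-evaluates its choice when system conditions change can keep outperforming a fixed homogeneous placement.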
