A detailed comparative study of open source deep learning frameworks

Deep Learning (DL) is one of the hottest trends in machine learning as DL approaches produced results superior to the state-of-the-art in problematic areas such as image processing and natural language processing (NLP). To foster the growth of DL, several open source frameworks appeared providing implementations of the most common DL algorithms. These frameworks vary in the algorithms they support and in the quality of their implementations. The purpose of this work is to provide a qualitative and quantitative comparison among three of the most popular and most comprehensive DL frameworks (namely Google's TensorFlow, University of Montreal's Theano and Microsoft's CNTK). The ultimate goal of this work is to help end users make an informed decision about the best DL framework that suits their needs and resources. To ensure that our study is as comprehensive as possible, we conduct several experiments using multiple benchmark datasets from different fields (image processing, NLP, etc.) and measure the performance of the frameworks' implementations of different DL algorithms. For most of our experiments, we find out that CNTK's implementations are superior to the other ones under consideration.

[1]  D. Cox The Regression Analysis of Binary Sequences , 1958 .

[2]  Razvan Pascanu,et al.  Theano: new features and speed improvements , 2012, ArXiv.

[3]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[4]  Kunihiko Fukushima,et al.  Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position , 1980, Biological Cybernetics.

[5]  A. G. Ivakhnenko,et al.  Polynomial Theory of Complex Systems , 1971, IEEE Trans. Syst. Man Cybern..

[6]  Kunihiko Fukushima,et al.  Neocognitron: A Self-Organizing Neural Network Model for a Mechanism of Visual Pattern Recognition , 1982 .

[7]  Yoshua Bengio,et al.  Maxout Networks , 2013, ICML.

[8]  Alex Krizhevsky,et al.  One weird trick for parallelizing convolutional neural networks , 2014, ArXiv.

[9]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Yann LeCun,et al.  The mnist database of handwritten digits , 2005 .

[11]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[12]  Qiang Wang,et al.  Benchmarking State-of-the-Art Deep Learning Software Tools , 2016, 2016 7th International Conference on Cloud Computing and Big Data (CCBD).

[13]  Beatrice Santorini,et al.  Building a Large Annotated Corpus of English: The Penn Treebank , 1993, CL.

[14]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[15]  Mohak Shah,et al.  Comparative Study of Deep Learning Software Frameworks , 2015, 1511.06435.

[16]  Razvan Pascanu,et al.  Theano: A CPU and GPU Math Compiler in Python , 2010, SciPy.

[17]  Yuan Yu,et al.  TensorFlow: A system for large-scale machine learning , 2016, OSDI.

[18]  Arthur L. Samuel,et al.  Some Studies in Machine Learning Using the Game of Checkers , 1967, IBM J. Res. Dev..

[19]  Takeo Kanade,et al.  First Results in Robot Road-Following , 1985, IJCAI.

[20]  Xiang Zhang,et al.  OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks , 2013, ICLR.

[21]  Christopher Joseph Pal,et al.  Describing Videos by Exploiting Temporal Structure , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[22]  Luca Maria Gambardella,et al.  Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence Flexible, High Performance Convolutional Neural Networks for Image Classification , 2022 .

[23]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[24]  John Salvatier,et al.  Theano: A Python framework for fast computation of mathematical expressions , 2016, ArXiv.

[25]  Yoshua Bengio,et al.  Learning deep physiological models of affect , 2013, IEEE Computational Intelligence Magazine.

[26]  Geoffrey Zweig,et al.  An introduction to computational networks and the computational network toolkit (invited talk) , 2014, INTERSPEECH.

[27]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[28]  Lawrence D. Jackel,et al.  Backpropagation Applied to Handwritten Zip Code Recognition , 1989, Neural Computation.

[29]  Jeffrey L. Elman,et al.  Finding Structure in Time , 1990, Cogn. Sci..

[30]  Mahmoud Al-Ayyoub,et al.  A comparative study of open source deep learning frameworks , 2018, 2018 9th International Conference on Information and Communication Systems (ICICS).

[31]  Martín Abadi,et al.  TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems , 2016, ArXiv.

[32]  Ahmed Ali Mohammed,et al.  Integral transforms and their applications , 2009 .

[33]  Peter Goldsborough,et al.  A Tour of TensorFlow , 2016, ArXiv.

[34]  Yu Zhang,et al.  The Computational Network Toolkit [Best of the Web] , 2015, IEEE Signal Process. Mag..

[35]  Christopher Potts,et al.  Learning Word Vectors for Sentiment Analysis , 2011, ACL.

[36]  A. Kalinovsky,et al.  Deep learning with theano, torch, caffe, tensorflow, and deeplearning4J: which one is the best in speed and accuracy? , 2016 .

[37]  Wojciech Zaremba,et al.  Recurrent Neural Network Regularization , 2014, ArXiv.

[38]  Jürgen Schmidhuber,et al.  Learning to Control Fast-Weight Memories: An Alternative to Dynamic Recurrent Networks , 1992, Neural Computation.

[39]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[40]  Alekseĭ Grigorʹevich Ivakhnenko,et al.  CYBERNETIC PREDICTING DEVICES , 1966 .

[41]  Graham W. Taylor,et al.  Theano-based Large-Scale Visual Recognition with Multiple GPUs , 2014, ICLR.

[42]  Jeff Johnson,et al.  Fast Convolutional Nets With fbfft: A GPU Performance Evaluation , 2014, ICLR.

[43]  Geoffrey E. Hinton,et al.  Reducing the Dimensionality of Data with Neural Networks , 2006, Science.