PyTorch: An Imperative Style, High-Performance Deep Learning Library

Deep learning frameworks have often focused on either usability or speed, but not both. PyTorch is a machine learning library that shows that these two goals are in fact compatible: it was designed from first principles to support an imperative and Pythonic programming style that treats code as a model, makes debugging easy, and is consistent with other popular scientific computing libraries, all while remaining efficient and supporting hardware accelerators such as GPUs. In this paper, we detail the principles that drove the implementation of PyTorch and how they are reflected in its architecture. We emphasize that every aspect of PyTorch is a regular Python program under the full control of its user. We also explain how the careful and pragmatic implementation of the key components of its runtime enables them to work together to achieve compelling performance. We demonstrate the efficiency of individual subsystems, as well as the overall speed of PyTorch, on several commonly used benchmarks.
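
As a brief illustration of this imperative, define-by-run style, here is a minimal sketch using the standard public PyTorch API (the example is ours, not taken from the paper): the model is ordinary Python code that executes eagerly and can be inspected with standard tools such as print or pdb.

    import torch
    import torch.nn as nn

    # An ordinary Python class: the forward pass is regular, eagerly
    # executed code, so it can be stepped through with a debugger.
    class TinyNet(nn.Module):
        def __init__(self):
            super().__init__()
            self.fc1 = nn.Linear(4, 8)
            self.fc2 = nn.Linear(8, 1)

        def forward(self, x):
            h = torch.relu(self.fc1(x))  # runs immediately; no separate graph-compilation step
            return self.fc2(h)

    model = TinyNet()
    x = torch.randn(2, 4)
    loss = model(x).sum()
    loss.backward()                      # reverse-mode automatic differentiation over the recorded ops
    print(model.fc1.weight.grad.shape)   # gradients are plain tensors: torch.Size([8, 4])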
