Theano: A Python framework for fast computation of mathematical expressions

Theano is a Python library that allows users to define, optimize, and efficiently evaluate mathematical expressions involving multi-dimensional arrays. Since its introduction, it has been one of the most widely used CPU and GPU mathematical compilers, especially in the machine learning community, and has shown steady performance improvements. Theano has been actively and continuously developed since 2008; multiple frameworks have been built on top of it, and it has been used to produce many state-of-the-art machine learning models. The present article is structured as follows. Section I provides an overview of the Theano software and its community. Section II presents the principal features of Theano and how to use them, and compares them with those of similar projects. Section III focuses on recently introduced functionalities and improvements. Section IV compares the performance of Theano against Torch7 and TensorFlow on several machine learning models. Section V discusses current limitations of Theano and potential ways of improving it.
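The define, optimize, and evaluate workflow mentioned in the abstract can be illustrated with a toy sketch. This is not Theano's API: the classes `Var`, `Const`, `Add`, `Mul` and the functions `optimize` and `evaluate` are hypothetical names invented for illustration, it works on scalars rather than multi-dimensional arrays, and it uses only the Python standard library. It shows only the general idea of building a symbolic expression graph, simplifying it once, and then evaluating it on concrete inputs.

```python
from dataclasses import dataclass
from typing import Dict, Union


# --- Define: symbolic nodes of a tiny expression graph (hypothetical names) ---
@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    value: float


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Mul:
    left: "Expr"
    right: "Expr"


Expr = Union[Var, Const, Add, Mul]


# --- Optimize: rewrite the graph before any data is seen ---
def optimize(e: Expr) -> Expr:
    """Fold constant subexpressions and drop identity operands (x+0, x*1)."""
    if isinstance(e, (Var, Const)):
        return e
    left, right = optimize(e.left), optimize(e.right)
    is_add = isinstance(e, Add)
    if isinstance(left, Const) and isinstance(right, Const):
        value = left.value + right.value if is_add else left.value * right.value
        return Const(value)
    identity = Const(0.0) if is_add else Const(1.0)
    if left == identity:
        return right
    if right == identity:
        return left
    return type(e)(left, right)


# --- Evaluate: run the (optimized) graph on concrete inputs ---
def evaluate(e: Expr, env: Dict[str, float]) -> float:
    if isinstance(e, Var):
        return env[e.name]
    if isinstance(e, Const):
        return e.value
    left, right = evaluate(e.left, env), evaluate(e.right, env)
    return left + right if isinstance(e, Add) else left * right


x = Var("x")
expr = Add(Mul(x, Const(1.0)), Mul(Const(2.0), Const(3.0)))  # x*1 + 2*3
fast = optimize(expr)                # simplified graph: x + 6
result = evaluate(fast, {"x": 4.0})  # same value as evaluating expr directly
```

Separating graph construction from evaluation is what lets a compiler of this kind apply rewrites once and reuse the simplified graph for every subsequent call.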

John Salvatier | Razvan Pascanu | Sina Honari | Christopher Joseph Pal | Colin Raffel | Yoshua Bengio | Dumitru Erhan | Yann Dauphin | Ziye Fan | Justin Bayer | Kai Jia | Orhan Firat | Qianli Ma | Aaron C. Courville | Kyunghyun Cho | Sander Dieleman | Julien Demouth | Ying Zhang | Dmitriy Serdyuk | Laurent Dinh | Balázs Hidasi | Dzmitry Bahdanau | Kelvin Xu | Roland Memisevic | Jan Chorowski | Marc-Alexandre Côté | Harm de Vries | Guillaume Desjardins | Sebastian Urban | Francesco Visin | Vincent Dumoulin | Saizheng Zhang | César Laurent | Li Yao | Samira Ebrahimi Kahou | Vincent Michalski | Ian J. Goodfellow | Sean Lee | Gabriel Schwartz | David Warde-Farley | Mathieu Germain | Nicolas Ballas | Xavier Bouthillier | Samira Shabanian | John Schulman | Çaglar Gülçehre | Anatoly Belikov | Pascal Vincent | Mehdi Mirza | Matthew Graham | Guillaume Alain | Amjad Almahairi | Olivier Delalleau | Alex Lamb | Philippe Hamel | Matthew Willson | Mikhail Korobov | James Bergstra | Frédéric Bastien | Alexandre de Brébisson | Zhouhan Lin | Matthew Rocklin | Dustin J. Webb | Gijs van Tulder | Eric Larsen | Adriana Romero | Myriam Côté | Nicolas Boulanger-Lewandowski | Jesse A. Livezey | Tim Cooijmans | Bart van Merrienboer | Sigurd Spieckermann | Pascal Lamblin | Markus Roth | Arjun Jain | Paul F. Christiano | Xavier Glorot | Melanie Ducoffe | Pierre-Antoine Manzagol | Peter Sadowski | Jakub Sygnowski | Daniel Renshaw | Lijun Xue | Pierre Luc Carrier | Joseph P. Turian | Olivier Breuleux | Arnaud Bergeron | Rami Al-Rfou' | Vivek Kulkarni | Sébastien Jean | Christof Angermüller | Nicolas Bouchard | Olivier Mastropietro | Jan Schlüter | Simon Lefrançois | Mohammad Pezeshki | Nicholas Léonard | François Savard | Simon Lemieux | Iulian Serban | Étienne Simon | Alexander Belopolsky | Valentin Bisson | Josh Bleecher Snyder | Iban Harlouchet | Jean-Philippe Heng | Cory Lorenz | Jeremiah Lowin | Robert McGibbon | Alberto Orlandi | S. 
Ramana Subramanyam | Jérémie Tanguay
