Coarsening optimization for differentiable programming

This paper presents a novel optimization for differentiable programming named coarsening optimization. It offers a systematic way to synergize symbolic differentiation and algorithmic differentiation (AD). With it, the granularity of the computation differentiated by each AD step can become much larger than a single operation, which leads to far fewer runtime computations and data allocations in AD. To circumvent the difficulties that control flow creates for symbolic differentiation in coarsening, this work introduces phi-calculus, a novel method that allows symbolic reasoning about, and differentiation of, computations that involve branches and loops. It further avoids "expression swell" in symbolic differentiation and balances reuse against coarsening through the design of reuse-centric segment-of-interest identification. Experiments on a collection of real-world applications show that coarsening optimization is effective in speeding up AD, producing speedups ranging from several times to two orders of magnitude.
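The granularity argument can be illustrated with a minimal sketch. It is not the paper's implementation: the `Tape` class, the example function y = sin(x) * x + x * x, and the hand-derived derivative are all assumptions introduced here for illustration. The fine-grained version records one tape entry per primitive operation, as operation-level reverse-mode AD would, while the coarsened version treats the whole segment as a single step whose derivative was obtained symbolically ahead of time.

```python
import math


class Tape:
    """Hypothetical minimal reverse-mode tape; one entry per differentiated step."""

    def __init__(self):
        # Each entry: (output index, [(input index, local partial), ...])
        self.entries = []
        self.num_vars = 0

    def new_var(self):
        idx = self.num_vars
        self.num_vars += 1
        return idx

    def record(self, inputs_and_partials):
        out = self.new_var()
        self.entries.append((out, inputs_and_partials))
        return out

    def gradient(self, output, wrt):
        # Standard reverse sweep: propagate adjoints from the output backwards.
        adjoint = [0.0] * self.num_vars
        adjoint[output] = 1.0
        for out, parts in reversed(self.entries):
            for inp, partial in parts:
                adjoint[inp] += adjoint[out] * partial
        return adjoint[wrt]


def fine_grained(tape, x_idx, x):
    """y = sin(x) * x + x * x, recording one tape entry per primitive operation."""
    s = math.sin(x)
    s_idx = tape.record([(x_idx, math.cos(x))])    # s = sin(x)
    p = s * x
    p_idx = tape.record([(s_idx, x), (x_idx, s)])  # p = s * x
    q = x * x
    q_idx = tape.record([(x_idx, x), (x_idx, x)])  # q = x * x (both operands are x)
    y = p + q
    y_idx = tape.record([(p_idx, 1.0), (q_idx, 1.0)])  # y = p + q
    return y_idx, y


def coarsened(tape, x_idx, x):
    """Same segment as a single tape entry.

    The derivative dy/dx = cos(x) * x + sin(x) + 2 * x was derived by hand
    (symbolically) for this example; it is an illustrative assumption.
    """
    y = math.sin(x) * x + x * x
    dy_dx = math.cos(x) * x + math.sin(x) + 2.0 * x
    y_idx = tape.record([(x_idx, dy_dx)])
    return y_idx, y


def run(segment, x):
    """Evaluate a segment at x; return (value, dy/dx, number of tape entries)."""
    tape = Tape()
    x_idx = tape.new_var()
    y_idx, y = segment(tape, x_idx, x)
    return y, tape.gradient(y_idx, x_idx), len(tape.entries)


if __name__ == "__main__":
    for segment in (fine_grained, coarsened):
        value, grad, steps = run(segment, 0.7)
        print(segment.__name__, value, grad, steps)
```

Both versions return the same value and gradient, but the fine-grained one stores four tape entries and the coarsened one stores a single entry. This straight-line example does not touch the harder cases the abstract mentions, namely segments containing branches and loops and the trade-off between coarsening and reuse.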
