Compilation of sparse array programming models

This paper shows how to compile sparse array programming languages. A sparse array programming language is an array programming language that supports element-wise application, reduction, and broadcasting of arbitrary functions over dense and sparse arrays with any fill value. Such a language is highly expressive: it can express sparse and dense linear and tensor algebra, functions over images, exclusion and inclusion filters, and even graph algorithms. Our compilation strategy generalizes prior work on sparse tensor algebra compilation to support any function applied to sparse arrays, instead of only addition and multiplication. To achieve this, we generalize the notion of sparse iteration spaces beyond intersections and unions. These iteration spaces are derived automatically by considering how algebraic properties annotated onto functions interact with the fill values of the arrays. We then show how to compile these iteration spaces to efficient code. We evaluate against two widely used Python sparse array packages: the code we generate for built-in sparse array library features runs at 1.4× to 53.7× the performance of PyData/Sparse for user-defined functions, and at 0.98× to 5.53× the performance of SciPy/Sparse for sparse array slicing. On end-to-end sparse array applications, our technique outperforms PyData/Sparse by 6.58× to 70.3× and, where applicable, runs at 0.96× to 28.9× the performance of a dense NumPy implementation. We also implement graph linear algebra kernels in our system, which run at 0.56× to 3.50× the performance of the hand-optimized SuiteSparse:GraphBLAS library.
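To make the interaction between algebraic properties and fill values concrete, here is a minimal Python sketch. It is not the paper's compiler or API: the names `SparseVec`, `iteration_space`, and `apply_elementwise` are hypothetical, the sparse vector is modeled as a plain dictionary, and the only algebraic property considered is a two-sided annihilator supplied by the caller. The sketch covers only the familiar intersection and union cases that the paper generalizes beyond.

```python
import operator
from dataclasses import dataclass, field


@dataclass
class SparseVec:
    """Toy sparse vector (hypothetical): stored entries plus a fill value."""
    size: int
    fill: float
    data: dict = field(default_factory=dict)  # coordinate -> stored value

    def __getitem__(self, i):
        # Coordinates that are not stored implicitly hold the fill value.
        return self.data.get(i, self.fill)


def iteration_space(a, b, annihilator=None):
    """Coordinates where f(a[i], b[i]) may differ from f(a.fill, b.fill).

    Assumes `annihilator` (if given) satisfies f(z, x) == f(x, z) == z.
    An operand whose fill equals the annihilator forces the result to the
    result fill wherever that operand is not stored, so only its stored
    coordinates need to be visited.
    """
    a_idx, b_idx = set(a.data), set(b.data)
    a_annihilates = annihilator is not None and a.fill == annihilator
    b_annihilates = annihilator is not None and b.fill == annihilator
    if a_annihilates and b_annihilates:
        return a_idx & b_idx   # intersection, e.g. multiply with fill 0
    if a_annihilates:
        return a_idx
    if b_annihilates:
        return b_idx
    return a_idx | b_idx       # union, e.g. add with fill 0


def apply_elementwise(f, a, b, annihilator=None):
    """Apply f element-wise, visiting only the derived iteration space."""
    out = SparseVec(a.size, f(a.fill, b.fill))
    for i in sorted(iteration_space(a, b, annihilator)):
        value = f(a[i], b[i])
        if value != out.fill:  # store only entries that differ from the fill
            out.data[i] = value
    return out


a = SparseVec(6, 0.0, {1: 2.0, 3: 4.0})
b = SparseVec(6, 0.0, {3: 5.0, 4: 6.0})

# 0 annihilates multiplication and both fills are 0: intersection.
prod = apply_elementwise(operator.mul, a, b, annihilator=0.0)
# Addition has no annihilator: union.
total = apply_elementwise(operator.add, a, b)
```

In this toy model, changing an operand's fill value or the annotated property changes which coordinates are visited, which is the kind of derivation the abstract describes in general form.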
