SPORES: Sum-Product Optimization via Relational Equality Saturation for Large Scale Linear Algebra

Machine learning algorithms are commonly specified in linear algebra (LA). LA expressions can be rewritten into more efficient forms by taking advantage of input properties such as sparsity, as well as program properties such as common subexpressions and fusible operators. The complex interaction among these properties and their effects on execution cost poses a challenge to optimizing compilers. Existing compilers resort to intricate heuristics that complicate the codebase and add maintenance cost, yet still fail to search the large space of equivalent LA expressions for the cheapest one. We introduce a general optimization technique for LA expressions: convert the LA expression into a Relational Algebra (RA) expression, optimize the latter, then convert the result back to an (optimized) LA expression. One major advantage of this method is that it is complete, meaning that any equivalent LA expression can be found using the equivalence rules in RA. The challenge is the enormous size of the search space, which we address by adopting and extending a compiler technique called equality saturation. We integrate the optimizer into SystemML and validate it empirically across a spectrum of machine learning tasks; we show that we can derive all existing hand-coded optimizations in SystemML, and perform new optimizations that lead to speedups from 1.2X to 5X.
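To make the core idea concrete, the following is a minimal sketch (not the SPORES implementation, which uses e-graphs over RA terms) of the search pattern the abstract describes: grow the set of expressions equivalent to the input under rewrite rules until it is saturated, then extract the cheapest member. The example uses a single rule, matrix-multiply associativity, with the standard n*m*p multiplication cost model; all names and the cost model here are illustrative assumptions.

```python
# Leaves are (name, rows, cols); internal nodes are ('mul', left, right).

def dims(e):
    """Shape of an expression: leaves carry it, products compose it."""
    if e[0] == 'mul':
        return (dims(e[1])[0], dims(e[2])[1])
    return (e[1], e[2])

def cost(e):
    """Total scalar multiplications, using the n*m*p model per product."""
    if e[0] != 'mul':
        return 0
    left, right = e[1], e[2]
    n, m = dims(left)
    _, p = dims(right)
    return cost(left) + cost(right) + n * m * p

def rewrites(e):
    """All one-step associativity rewrites: (AB)C <-> A(BC), anywhere."""
    out = []
    if e[0] == 'mul':
        left, right = e[1], e[2]
        if left[0] == 'mul':                       # (AB)C -> A(BC)
            out.append(('mul', left[1], ('mul', left[2], right)))
        if right[0] == 'mul':                      # A(BC) -> (AB)C
            out.append(('mul', ('mul', left, right[1]), right[2]))
        for l2 in rewrites(left):                  # rewrite inside subtrees
            out.append(('mul', l2, right))
        for r2 in rewrites(right):
            out.append(('mul', left, r2))
    return out

def saturate(e):
    """Closure of {e} under the rules; finite here, so it terminates."""
    seen, frontier = {e}, [e]
    while frontier:
        cur = frontier.pop()
        for nxt in rewrites(cur):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

A, B, C = ('A', 10, 100), ('B', 100, 5), ('C', 5, 50)
expr = ('mul', A, ('mul', B, C))          # A(BC): costs 75000
best = min(saturate(expr), key=cost)      # finds (AB)C: costs 7500
print(cost(best))  # 7500
```

Representing the equivalence class as an explicit set, as above, blows up quickly; equality saturation avoids this by sharing subterms in an e-graph, which is what makes the approach viable at the scale of real LA programs.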
