
EGGS: Sparsity‐Specific Code Generation

Sparse matrix computations are among the most important computational patterns, commonly used in geometry processing, physical simulation, graph algorithms, and other situations where sparse data arises. In many cases, the structure of a sparse matrix is known a priori, but the values may change or depend on inputs to the algorithm. We propose a new methodology for compile‐time specialization of algorithms that mix sparse and dense linear algebra operations, using an extension to the widely used open source Eigen package. In contrast to library approaches that optimize individual building blocks of a computation (such as the sparse matrix product), we generate reusable sparsity‐specific implementations for a given algorithm, utilizing vector intrinsics and reducing unnecessary scanning through matrix structures. We evaluate our technique on a benchmark of artificial expressions to quantify its benefit over the state‐of‐the‐art library Intel MKL. To further demonstrate its practical applicability, we show that it can improve performance, with minimal code changes, for mesh smoothing, mesh parametrization, volumetric deformation, optical flow, and computation of the Laplace operator.
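To illustrate what sparsity-specific code means here, the following is a minimal hand-written sketch, not the EGGS generator or its Eigen extension. It contrasts a generic CSR matrix-vector product, which scans the index arrays on every call, with a kernel specialized to one fixed sparsity pattern, where only the nonzero values are passed in. The 3x3 pattern and the names `CsrMatrix`, `make_example_matrix`, `spmv_generic`, and `spmv_specialized` are all hypothetical and chosen only for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Generic CSR storage: the structure (row_ptr, col_idx) is data that must be
// read at run time on every product.
struct CsrMatrix {
    std::size_t rows;
    std::vector<std::size_t> row_ptr;  // size rows + 1
    std::vector<std::size_t> col_idx;  // size nnz
    std::vector<double> values;        // size nnz
};

// Hypothetical fixed pattern used throughout this sketch:
//   [ v0  v1  0  ]
//   [ 0   v2  0  ]
//   [ v3  0   v4 ]
CsrMatrix make_example_matrix(const std::array<double, 5>& v) {
    return CsrMatrix{3,
                     {0, 2, 3, 5},
                     {0, 1, 1, 0, 2},
                     std::vector<double>(v.begin(), v.end())};
}

// Library-style kernel: works for any pattern, but scans the index arrays.
std::vector<double> spmv_generic(const CsrMatrix& a, const std::vector<double>& x) {
    assert(a.row_ptr.size() == a.rows + 1);
    std::vector<double> y(a.rows, 0.0);
    for (std::size_t i = 0; i < a.rows; ++i) {
        for (std::size_t k = a.row_ptr[i]; k < a.row_ptr[i + 1]; ++k) {
            y[i] += a.values[k] * x[a.col_idx[k]];
        }
    }
    return y;
}

// Pattern-specific kernel: the indices are baked into the code, so no
// structure is scanned. Only the values may change between calls.
std::array<double, 3> spmv_specialized(const std::array<double, 5>& v,
                                       const std::array<double, 3>& x) {
    std::array<double, 3> y{};
    y[0] = v[0] * x[0] + v[1] * x[1];
    y[1] = v[2] * x[1];
    y[2] = v[3] * x[0] + v[4] * x[2];
    return y;
}
```

Both kernels compute the same product for this pattern. The specialized one can be reused whenever the values change but the structure does not, which is the situation the abstract describes. A real generator would presumably emit such code automatically for whole expressions and use vector intrinsics; this sketch shows neither.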
