EGGS: Sparsity‐Specific Code Generation
Jinyang Li | Shoaib Kamil | Daniele Panozzo | Teseo Schneider | Aurojit Panda | Xuan Tang
[1] Elizabeth R. Jessup,et al. Reliable Generation of High-Performance Matrix Algebra , 2012, ACM Trans. Math. Softw..
[2] Jack J. Dongarra,et al. Automatically Tuned Linear Algebra Software , 1998, Proceedings of the IEEE/ACM SC98 Conference.
[3] Mark Meyer,et al. Implicit fairing of irregular meshes using diffusion and curvature flow , 1999, SIGGRAPH.
[4] Shoaib Kamil,et al. OpenTuner: An extensible framework for program autotuning , 2014, 2014 23rd International Conference on Parallel Architecture and Compilation (PACT).
[5] Albert Cohen,et al. Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions , 2018, ArXiv.
[6] Qian Wang,et al. AUGEM: Automatically generate high performance Dense Linear Algebra kernels on x86 CPUs , 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[7] Gaël Varoquaux,et al. The NumPy Array: A Structure for Efficient Numerical Computation , 2011, Computing in Science & Engineering.
[8] Elizabeth R. Jessup,et al. Automating the generation of composed linear algebra kernels , 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[9] Aart J. C. Bik,et al. Compilation techniques for sparse matrix computations , 1993, ICS '93.
[10] Olaf Schenk,et al. Toward the Next Generation of Multiperiod Optimal Power Flow Solvers , 2018, IEEE Transactions on Power Systems.
[11] Peter Ahrens,et al. Tensor Algebra Compilation with Workspaces , 2019, 2019 IEEE/ACM International Symposium on Code Generation and Optimization (CGO).
[12] Olaf Schenk,et al. Enhancing the scalability of selected inversion factorization algorithms in genomic prediction , 2017, J. Comput. Sci..
[13] Jan Fostier,et al. Needles: Toward Large-Scale Genomic Prediction with Marker-by-Environment Interaction , 2016, Genetics.
[14] Shoaib Kamil,et al. ParSy: Inspection and Transformation of Sparse Matrix Computations for Parallelism , 2018, SC18: International Conference for High Performance Computing, Networking, Storage and Analysis.
[15] Tomofumi Yuki,et al. Sparse computation data dependence simplification for efficient compiler-generated inspectors , 2019, PLDI.
[16] John Michael McNamee. Algorithm 408: a sparse matrix package (part I) [F4] , 1971, CACM.
[17] Wojciech Matusik,et al. Simit , 2016, ACM Trans. Graph..
[18] Paul Feautrier,et al. Dataflow analysis of array and scalar references , 1991, International Journal of Parallel Programming.
[19] Benoît Meister,et al. Polyhedral Optimization of TensorFlow Computation Graphs , 2017, ESPT/VPA@SC.
[20] J. W. Walker,et al. Direct solutions of sparse network equations by optimally ordered triangular factorization , 1967 .
[21] Hans-Peter Seidel,et al. Interactive multi-resolution modeling on arbitrary meshes , 1998, SIGGRAPH.
[22] Mary W. Hall,et al. Loop and data transformations for sparse matrix code , 2015, PLDI.
[23] Matthias Nießner,et al. Opt , 2016, ACM Trans. Graph..
[24] Paul Feautrier,et al. Array expansion , 1988, ICS '88.
[25] William Gropp,et al. Efficient Management of Parallelism in Object-Oriented Numerical Software Libraries , 1997, SciTools.
[26] Victor Alessandrini. Intel Threading Building Blocks , 2016 .
[27] Michael Wolfe,et al. Optimizing supercompilers for supercomputers , 1989, ICS.
[28] Katherine Yelick,et al. Autotuning Sparse Matrix-Vector Multiplication for Multicore , 2012 .
[29] François Irigoin,et al. Supernode partitioning , 1988, POPL '88.
[30] Chau-Wen Tseng,et al. Improving data locality with loop transformations , 1996, TOPLAS.
[31] Conrad Sanderson,et al. Armadillo: An Open Source C++ Linear Algebra Library for Fast Prototyping and Computationally Intensive Experiments , 2010 .
[32] Olga Sorkine-Hornung,et al. Scalable locally injective mappings , 2017, ACM Trans. Graph..
[33] William Pugh,et al. SIPR: A New Framework for Generating Efficient Code for Sparse Matrix Computations , 1998, LCPC.
[34] Shoaib Kamil,et al. Sympiler: Transforming Sparse Matrix Codes by Decoupling Symbolic Analysis , 2017, SC17: International Conference for High Performance Computing, Networking, Storage and Analysis.
[35] Katherine Yelick,et al. OSKI: A library of automatically tuned sparse matrix kernels , 2005 .
[36] Berthold K. P. Horn,et al. Determining Optical Flow , 1981, Other Conferences.
[37] Aart J. C. Bik,et al. On Automatic Data Structure Selection and Code Generation for Sparse Computations , 1993, LCPC.
[38] Philip Levis,et al. Ebb: A DSL for Physical Simulation on CPUs and GPUs , 2015, ACM Trans. Graph..
[39] Mary W. Hall,et al. The Sparse Polyhedral Framework: Composing Compiler-Generated Inspector-Executor Code , 2018, Proceedings of the IEEE.
[40] James Demmel,et al. Optimizing matrix multiply using PHiPAC: a portable, high-performance, ANSI C coding methodology , 1997, ICS '97.
[41] Iain S. Duff,et al. An overview of the sparse basic linear algebra subprograms: The new standard from the BLAS technical forum , 2002, ACM Trans. Math. Softw..
[42] Saman P. Amarasinghe,et al. Format abstraction for sparse tensor algebra compilers , 2018, Proc. ACM Program. Lang..
[43] Keshav Pingali,et al. A Relational Approach to the Compilation of Sparse Matrix Programs , 1997, Euro-Par.
[44] Gabriel Rodríguez,et al. Generating piecewise-regular code from irregular structures , 2019, PLDI.
[45] Michael E. Wolf,et al. A data locality optimizing algorithm , 1991 .
[46] Shoaib Kamil,et al. The tensor algebra compiler , 2017, Proc. ACM Program. Lang..