A performance optimization framework for compilation of tensor contraction expressions into parallel
David E. Bernholdt | Robert J. Harrison | J. Ramanujam | Gerald Baumgartner | Chi-Chung Lam | P. Sadayappan | Daniel Cociorva | Marcel Nooijen