An Approach for Semiautomatic Locality Optimizations Using OpenMP
[1] P. Altena, et al. In search of clusters, 2007.
[2] Michael Bader, et al. Hardware-Oriented Implementation of Cache Oblivious Matrix Operations Based on Space-Filling Curves, 2007, PPAM.
[3] Michael Wolfe, et al. High performance compilers for parallel computing, 1995.
[4] Gregory Francis Pfister, et al. In search of clusters (2nd ed.), 1998.
[5] Guang R. Gao, et al. Tile Reduction: The First Step towards Tile Aware Parallelization in OpenMP, 2009, IWOMP.
[6] Steven J. Deitz, et al. High-level Language Support for User-defined Reductions, 2004, The Journal of Supercomputing.
[7] Gregory F. Pfister, et al. In Search of Clusters, 1995.
[8] Sven-Bodo Scholz. On defining application-specific high-level array operations by means of shape-invariant programming facilities, 1999.
[9] Keshav Pingali, et al. Tiling Imperfectly-nested Loop Nests (REVISED), 2000.
[10] Hugh Garraway. Parallel Computer Architecture: A Hardware/Software Approach, 1999, IEEE Concurrency.
[11] Bronis R. de Supinski, et al. Evolving OpenMP in an Age of Extreme Parallelism, 5th International Workshop on OpenMP, IWOMP 2009, Dresden, Germany, June 3-5, 2009, Proceedings, 2009, IWOMP.
[12] Keshav Pingali, et al. Tiling Imperfectly-nested Loop Nests, 2000, ACM/IEEE SC 2000 Conference (SC'00).
[13] Samuel Williams, et al. Optimization and Performance Modeling of Stencil Computations on Modern Microprocessors, 2007, SIAM Rev.
[14] David F. Bacon, et al. Compiler transformations for high-performance computing, 1994, CSUR.