An Experimental Investigation of Scalable Locality for Cluster Computing
Ian Burnette | Tim Douglas | Micah John Walter | David Wonnacott
[1] Rudolf Eigenmann, et al. Optimizing OpenMP Programs on Software Distributed Shared Memory Systems, 2004, International Journal of Parallel Programming.
[2] Robert A. van de Geijn, et al. Satisfying your dependencies with SuperMatrix, 2007, 2007 IEEE International Conference on Cluster Computing.
[3] David G. Wonnacott, et al. Achieving Scalable Locality with Time Skewing, 2002, International Journal of Parallel Programming.
[4] David G. Wonnacott, et al. Using time skewing to eliminate idle time due to memory bandwidth and network limitations, 2000, Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000.
[5] Ken Kennedy, et al. Estimating Interlock and Improving Balance for Pipelined Architectures, 1988, J. Parallel Distributed Comput.
[6] Rudolf Eigenmann, et al. Towards OpenMP Execution on Software Distributed Shared Memory Systems, 2002, ISHPC.
[7] Martin Griebl, et al. Automatic code generation for distributed memory architectures in the polytope model, 2006, Proceedings 20th IEEE International Parallel & Distributed Processing Symposium.
[8] Micah John Walter, et al. Experiences with expressing and optimizing dense numerical algorithms in AlphaZ, 2011.
[9] Tim Douglas. An Empirical Study of the Performance of Scalable Locality on a Distributed Shared Memory System, 2011.
[10] Uday Bondhugula, et al. Automatic Transformations for Communication-Minimized Parallelization and Locality Optimization in the Polyhedral Model, 2008, CC.
[11] Uday Bondhugula, et al. A practical automatic polyhedral parallelizer and locality optimizer, 2008, PLDI '08.
[12] Zhiyuan Li, et al. New tiling techniques to improve cache temporal locality, 1999, PLDI '99.
[13] Ken Kennedy, et al. Compiling Fortran D for MIMD distributed-memory machines, 1992, CACM.
[14] John D. McCalpin, et al. Time Skewing: A Value-Based Approach to Optimizing for Memory Locality, 1999.
[15] Uday Bondhugula, et al. Effective automatic parallelization of stencil computations, 2007, PLDI '07.
[16] Rudolf Eigenmann, et al. Towards automatic translation of OpenMP to MPI, 2005, ICS '05.
[17] François Irigoin, et al. Supernode partitioning, 1988, POPL '88.