Vectorized Higher Order Finite Difference Kernels
暂无分享,去创建一个
[1] Satoshi Matsuoka,et al. Physis: An implicitly parallel programming model for stencil computations on large-scale GPU-accelerated supercomputers , 2011, 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[2] Gerhard Wellein,et al. Efficient Temporal Blocking for Stencil Computations by Multicore-Aware Wavefront Parallelization , 2009, 2009 33rd Annual IEEE International Computer Software and Applications Conference.
[3] Christian Weiß,et al. Data locality optimizations for multigrid methods on structured grids , 2001 .
[4] Samuel Williams,et al. Optimization and Performance Modeling of Stencil Computations on Modern Microprocessors , 2007, SIAM Rev..
[5] Chau-Wen Tseng,et al. Tiling Optimizations for 3D Scientific Computations , 2000, ACM/IEEE SC 2000 Conference (SC'00).
[6] Ulrich Rüde,et al. Optimising a 3D multigrid algorithm for the IA-64 architecture , 2008, Int. J. Comput. Sci. Eng..
[7] Paulius Micikevicius,et al. 3D finite difference computation on GPUs using CUDA , 2009, GPGPU-2.
[8] John D. McCalpin,et al. Time Skewing: A Value-Based Approach to Optimizing for Memory Locality , 1999 .
[9] Frank Mueller,et al. Auto-generation and auto-tuning of 3D stencil codes on GPU clusters , 2012, CGO '12.
[10] P. Sadayappan,et al. High-performance code generation for stencil computations on GPU architectures , 2012, ICS '12.
[11] Gerhard W. Zumbusch,et al. Tuning a Finite Difference Computation for Parallel Vector Processors , 2012, 2012 11th International Symposium on Parallel and Distributed Computing.
[12] Pradeep Dubey,et al. 3.5-D Blocking Optimization for Stencil Computations on Modern CPUs and GPUs , 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.
[13] Zhiyuan Li,et al. New tiling techniques to improve cache temporal locality , 1999, PLDI '99.
[14] Samuel Williams,et al. Optimization of geometric multigrid for emerging multi- and manycore processors , 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.