Optimization of an Electromagnetics Code with Multicore Wavefront Diamond Blocking and Multi-dimensional Intra-Tile Parallelization
[1] Pradeep Dubey, et al. 3.5-D Blocking Optimization for Stencil Computations on Modern CPUs and GPUs, 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.
[2] Gerhard Wellein, et al. LIKWID: A Lightweight Performance-Oriented Tool Suite for x86 Multicore Environments, 2010, 2010 39th International Conference on Parallel Processing Workshops.
[3] A Thesis, et al. Tiling Stencil Computations to Maximize Parallelism, 2013.
[4] Thomas Ilsche, et al. An Energy Efficiency Feature Survey of the Intel Haswell Processor, 2015, 2015 IEEE International Parallel and Distributed Processing Symposium Workshop.
[5] R. B. Standler, et al. A frequency-dependent finite-difference time-domain formulation for dispersive materials, 1990.
[6] Dennis M. Sullivan, et al. Frequency-dependent FDTD methods using Z transforms, 1992.
[7] Gerhard Wellein, et al. Quantifying Performance Bottlenecks of Stencil Computations Using the Execution-Cache-Memory Model, 2014, ICS.
[8] Christoph Pflaum, et al. An iterative solver for the finite-difference frequency-domain (FDFD) method for the simulation of materials with negative permittivity, 2011, Numer. Linear Algebra Appl.
[9] David E. Keyes, et al. Multicore-Optimized Wavefront Diamond Blocking for Optimizing Stencil Updates, 2014, SIAM J. Sci. Comput.
[10] Christoph Pflaum, et al. Studying the effect of scattering layers on the efficiency of thin film solar cells, 2014, Numerical Simulation of Optoelectronic Devices, 2014.
[11] Jean-Pierre Berenger, et al. A perfectly matched layer for the absorption of electromagnetic waves, 1994.
[12] Gerhard Wellein, et al. LIKWID: Lightweight Performance Tools, 2011, CHPC.
[13] Martin A. Green, et al. Solar cell efficiency tables (version 46), 2015.
[14] Gerhard Wellein, et al. Efficient Temporal Blocking for Stencil Computations by Multicore-Aware Wavefront Parallelization, 2009, 2009 33rd Annual IEEE International Computer Software and Applications Conference.
[15] Guang R. Gao, et al. Locality aware concurrent start for stencil applications, 2015, 2015 IEEE/ACM International Symposium on Code Generation and Optimization (CGO).
[16] Om P. Gandhi, et al. A frequency-dependent finite-difference time-domain formulation for general dispersive media, 1993.
[17] K. Yee. Numerical solution of initial boundary value problems involving Maxwell's equations in isotropic media, 1966.
[18] Guang R. Gao, et al. Mapping the FDTD Application to Many-Core Chip Architectures, 2009, 2009 International Conference on Parallel Processing.
[19] R. J. Luebbers, et al. Piecewise linear recursive convolution for dispersive media using FDTD, 1996.
[20] Christoph J. Brabec, et al. Numerical simulation of light propagation in silver nanowire films using time-harmonic inverse iterative method, 2013.
[21] Leslie Lamport, et al. The parallel execution of DO loops, 1974, CACM.
[22] Bradley C. Kuszmaul, et al. The Pochoir stencil compiler, 2011, SPAA '11.
[23] David E. Keyes, et al. Multidimensional Intratile Parallelization for Memory-Starved Stencil Computations, 2015, ACM Trans. Parallel Comput.
[24] Roger W. Hockney, et al. F1/2: a Parameter to Characterize Memory and Communication Bottlenecks, 1989, Parallel Comput.
[25] Uday Bondhugula, et al. A practical automatic polyhedral parallelizer and locality optimizer, 2008, PLDI '08.
[26] A. Erdmann, et al. Finite integration (FI) method for modelling optical waves in lithography masks, 2009, 2009 International Conference on Electromagnetics in Advanced Applications.
[27] C. Leopold. Tight Bounds on Capacity Misses for 3D Stencil Codes, 2002.
[28] Christoph Pflaum, et al. The SiSoFlex Project: Silicon Based Thin-Film Solar Cells on Flexible Aluminium Substrates, 2014.
[29] Hans-Peter Seidel, et al. Cache Accurate Time Skewing in Iterative Stencil Computations, 2011, 2011 International Conference on Parallel Processing.
[30] Katherine Yelick, et al. Auto-tuning stencil codes for cache-based multicore platforms, 2009.