Algorithm 942
[1] Mark F. Adams,et al. Chombo Software Package for AMR Applications Design Document , 2014 .
[2] Volker Strumpen,et al. The Cache Complexity of Multithreaded Cache Oblivious Algorithms , 2009, SPAA '06.
[3] Chau-Wen Tseng,et al. Tiling Optimizations for 3D Scientific Computations , 2000, ACM/IEEE SC 2000 Conference (SC'00).
[4] José María Cela,et al. Introducing the Semi-stencil Algorithm , 2009, PPAM.
[5] Mauricio Araya-Polo,et al. Towards a Multi-Level Cache Performance Model for 3D Stencil Computation , 2011, ICCS.
[6] Catherine de Groot-Hedlin,et al. A Finite Difference Solution to the Helmholtz Equation in a Radially Symmetric Waveguide: Application to Near-Source Scattering in Ocean Acoustics , 2008 .
[7] Samuel Williams,et al. Optimization and Performance Modeling of Stencil Computations on Modern Microprocessors , 2007, SIAM Rev..
[8] Monica S. Lam,et al. The cache performance and optimizations of blocked algorithms , 1991, ASPLOS IV.
[9] George Ho,et al. PAPI: A Portable Interface to Hardware Performance Counters , 1999 .
[10] X. Andrade,et al. Efficient formalism for large-scale ab initio molecular dynamics based on time-dependent density functional theory. , 2007, Physical review letters.
[11] Samuel Williams,et al. Auto-Tuning Stencil Computations on Multicore and Accelerators , 2010, Scientific Computing with Multicore and Accelerators.
[12] Mauricio Hanzich,et al. 3D seismic imaging through reverse-time migration on homogeneous and heterogeneous multi-core processors , 2009, Sci. Program..
[13] Ken Kennedy,et al. Software prefetching , 1991, ASPLOS IV.
[14] Anoop Gupta,et al. Tolerating Latency Through Software-Controlled Prefetching in Shared-Memory Multiprocessors , 1991, J. Parallel Distributed Comput..
[15] Georg Hager,et al. Introducing a Performance Model for Bandwidth-Limited Loop Kernels , 2009, PPAM.
[16] Axel Brandenburg,et al. Computational aspects of astrophysical MHD and turbulence , 2001, Advances in Nonlinear Dynamos.
[17] Gerhard Wellein,et al. LIKWID: A Lightweight Performance-Oriented Tool Suite for x86 Multicore Environments , 2010, 2010 39th International Conference on Parallel Processing Workshops.
[18] John D. McCalpin,et al. Time Skewing: A Value-Based Approach to Optimizing for Memory Locality , 1999 .
[19] H. Appel,et al. octopus: a tool for the application of time‐dependent density functional theory , 2006 .
[20] Volker Strumpen,et al. Cache oblivious stencil computations , 2005, ICS '05.
[21] Gerhard Wellein,et al. Complexities of Performance Prediction for Bandwidth-Limited Loop Kernels on Multi-Core Architectures , 2010 .
[22] George A. McMechan,et al. A review of seismic acoustic imaging by reverse‐time migration , 1989, Int. J. Imaging Syst. Technol..
[23] Patrick R. Amestoy,et al. 3D Frequency-domain Finite-difference Modeling of Acoustic Wave Propagation Using a Massively Parallel Direct Solver: a Feasibility Study , 2005 .
[24] David G. Wonnacott,et al. Time Skewing for Parallel Computers , 1999, LCPC.
[25] Tarek S. Abdelrahman,et al. Fusion of Loops for Parallelism and Locality , 1997, IEEE Trans. Parallel Distributed Syst..
[26] Olivier Temam,et al. Cache interference phenomena , 1994, SIGMETRICS.
[27] A. Prieto,et al. Perfectly matched layers for modelling seismic oceanography experiments , 2008 .
[28] Samuel Williams,et al. Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures , 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis.
[29] Chau-Wen Tseng,et al. Improving data locality with loop transformations , 1996, TOPL.
[30] Vicki H. Allan,et al. Software pipelining , 1995, CSUR.
[31] Anne Rogers,et al. Software support for speculative loads , 1992, ASPLOS V.
[32] Samuel Williams,et al. Roofline: an insightful visual performance model for multicore architectures , 2009, CACM.
[33] Todd C. Mowry,et al. Tolerating latency through software-controlled data prefetching , 1994 .
[34] Leonid Oliker,et al. Impact of modern memory subsystems on cache optimizations for stencil computations , 2005, MSP '05.
[35] Samuel Williams,et al. Roofline: An Insightful Visual Performance Model for Floating-Point Programs and Multicore Architectures , 2008 .
[36] Samuel Williams,et al. Implicit and explicit optimizations for stencil computations , 2006, MSPC '06.