Performance characterization and evaluation of parallel PDE solvers
[1] Chau-Wen Tseng,et al. Improving data locality with loop transformations , 1996, TOPL.
[2] David E. Bernholdt,et al. Computational Quality of Service for Scientific Components , 2004, CBSE.
[3] P. Colella,et al. Local adaptive mesh refinement for shock hydrodynamics , 1989 .
[4] Scott H. Hawley,et al. Boson stars driven to the brink of black hole formation , 2000 .
[5] Brian J. N. Wylie,et al. Memory Profiling using Hardware Counters , 2003, ACM/IEEE SC 2003 Conference (SC'03).
[6] James Arthur Kohl,et al. A Component Architecture for High-Performance Computing , 2003 .
[7] Erik Hagersten,et al. SIP: Performance Tuning through Source Code Interdependence , 2002, Euro-Par.
[8] Susan J. Eggers,et al. Eliminating False Sharing , 1991, ICPP.
[9] Erik Hagersten,et al. VASA: A Simulator Infrastructure with Adjustable Fidelity , 2005, IASTED PDCS.
[10] A.M. Wissink,et al. Large Scale Parallel Structured AMR Calculations Using the SAMRAI Framework , 2001, ACM/IEEE SC 2001 Conference (SC'01).
[11] Jarmo Rantakokko. Partitioning strategies for structured multiblock grids , 2000, Parallel Comput..
[12] Peng Wang,et al. A New Generation EOS Compositional Reservoir Simulator: Part II - Framework and Multiprocessing , 1997 .
[13] Sverker Holmgren,et al. Cache Memory Behavior of Advanced PDE Solvers , 2003, PARCO.
[14] Alan Jay Smith,et al. Line (Block) Size Choice for CPU Cache Memories , 1987, IEEE Transactions on Computers.
[15] Ken Kennedy,et al. Improving Memory Hierarchy Performance through Combined Loop Interchange and Multi-Level Fusion , 2004, Int. J. High Perform. Comput. Appl..
[16] Michael Thuné,et al. Partitioning Strategies for Composite Grids , 1997, Parallel Algorithms Appl..
[17] J.C. Browne,et al. A Common Data Management Infrastructure for Adaptive Algorithms for PDE Solutions , 1997, ACM/IEEE SC 1997 Conference (SC'97).
[18] Chau-Wen Tseng,et al. Data transformations for eliminating conflict misses , 1998, PLDI.
[19] Dinshaw S. Balsara,et al. Highly parallel structured adaptive mesh refinement using parallel language-based approaches , 2001, Parallel Comput..
[20] Jarmo Rantakokko,et al. A Framework for Partitioning Structured Grids with Inhomogeneous Workload , 1998, Parallel Algorithms Appl..
[21] David S. Johnson,et al. Some simplified NP-complete problems , 1974, STOC '74.
[22] Alain Darte. On the Complexity of Loop Fusion , 2000, Parallel Comput..
[23] Markus Kowarschik,et al. An Overview of Cache Optimization Techniques and Cache-Aware Numerical Algorithms , 2002, Algorithms for Memory Hierarchies.
[24] Kamy Sepehrnoori,et al. A New Generation EOS Compositional Reservoir Simulator: Part I - Formulation and Discretization , 1997 .
[25] Manish Parashar,et al. An Application-Centric Characterization of Domain-Based SFC Partitioners for Parallel SAMR , 2002, IEEE Trans. Parallel Distributed Syst..
[26] Susan J. Eggers,et al. Reducing false sharing on shared memory multiprocessors through compile time data transformations , 1995, PPOPP '95.
[27] Ralf Deiterding,et al. An improved bi-level algorithm for partitioning dynamic grid hierarchies , 2006 .
[28] Frederik Edelvik,et al. Hybrid Solvers for the Maxwell Equations in Time-Domain , 2002 .
[29] Greg L. Bryan,et al. Fluids in the universe: adaptive mesh refinement in cosmology , 1999, Comput. Sci. Eng..
[30] James Arthur Kohl,et al. A Component Architecture for High-Performance Scientific Computing , 2006, Int. J. High Perform. Comput. Appl..
[31] Manish Parashar,et al. Characterization of domain-based partitioners for parallel SAMR applications , 2000 .
[32] Johan Steensland. Efficient Partitioning of Dynamic Structured Grid Hierarchies , 2002 .
[33] Ken Kennedy,et al. Optimizing Compilers for Modern Architectures: A Dependence-based Approach , 2001 .
[34] David A. Patterson,et al. Computer Architecture - A Quantitative Approach, 5th Edition , 1996 .
[35] Ulrich Rüde,et al. Cache Optimization for Structured and Unstructured Grid Multigrid , 2000 .
[36] Bradford Sturtevant,et al. Experiments on the Richtmyer-Meshkov instability of an air/SF6 interface , 1995 .
[37] Allen D. Malony,et al. The Tau Parallel Performance System , 2006, Int. J. High Perform. Comput. Appl..
[38] Richard D. Hornung,et al. Enhancing scalability of parallel structured AMR calculations , 2003, ICS '03.
[39] Richard Wolski,et al. The network weather service: a distributed resource performance forecasting service for metacomputing , 1999, Future Gener. Comput. Syst..
[40] Matthew W. Choptuik. Experiences with an adaptive mesh refinement algorithm in numerical relativity , 1989 .
[41] Jaideep Ray,et al. A heuristic re-mapping algorithm reducing inter-level communication in SAMR applications , 2003 .
[42] Antony Jameson,et al. How Many Steps are Required to Solve the Euler Equations of Steady, Compressible Flow: In Search of a Fast Solution Algorithm , 2001 .
[43] Scott Devine,et al. Using the SimOS machine simulator to study complex computer systems , 1997, TOMC.
[44] Anthony T. Chronopoulos,et al. s-step iterative methods for symmetric linear systems , 1989 .
[45] Manish Parashar,et al. Characterizing the Performance of Dynamic Distribution and Load-Balancing Techniques for Adaptive Grid Hierarchies , 1999 .
[46] Erik Hagersten,et al. Miss penalty reduction using bundled capacity prefetching in multiprocessors , 2003, Proceedings International Parallel and Distributed Processing Symposium.
[47] Zhiling Lan,et al. Dynamic Load Balancing of SAMR Applications on Distributed Systems , 2001, ACM/IEEE SC 2001 Conference (SC'01).
[48] Fredrik Larsson,et al. Simics: A Full System Simulation Platform , 2002, Computer.
[49] Seung Ryoul Maeng,et al. An adaptive sequential prefetching scheme in shared-memory multiprocessors , 1997, Proceedings of the 1997 International Conference on Parallel Processing (Cat. No.97TB100162).
[50] Jeffrey K. Hollingsworth,et al. Using Hardware Performance Monitors to Isolate Memory Bottlenecks , 2000, ACM/IEEE SC 2000 Conference (SC'00).
[51] Jarmo Rantakokko,et al. Algorithmic optimizations of a conjugate gradient solver on shared memory architectures , 2006, Int. J. Parallel Emergent Distributed Syst..
[52] Ralf Deiterding,et al. A virtual test facility for the efficient simulation of solid material response under strong shock and detonation wave loading , 2006, Engineering with Computers.
[53] Allen D. Malony,et al. Computational Quality of Service for Scientific CCA Applications: Composition, Substitution, and Reconfiguration , 2006 .
[54] James C. Browne,et al. On partitioning dynamic adaptive grid hierarchies , 1996, Proceedings of HICSS-29: 29th Hawaii International Conference on System Sciences.
[55] G. Bryan,et al. Cosmological Adaptive Mesh Refinement , 1998, astro-ph/9807121.
[56] James Arthur Kohl,et al. Parallel PDE-Based Simulations Using the Common Component Architecture , 2006 .
[57] Michel Dubois,et al. Sequential Hardware Prefetching in Shared-Memory Multiprocessors , 1995, IEEE Trans. Parallel Distributed Syst..
[58] Allen D. Malony,et al. PerfExplorer: A Performance Data Mining Framework For Large-Scale Parallel Computing , 2005, ACM/IEEE SC 2005 Conference (SC'05).
[59] Michael L. Gittings,et al. Modeling the 1958 Lituya Bay Mega-Tsunami, II , 2002 .
[60] Erik Hagersten,et al. StatCache: a probabilistic approach to efficient and accurate data locality analysis , 2004, IEEE International Symposium on - ISPASS Performance Analysis of Systems and Software, 2004.
[61] Allen D. Malony,et al. Design and implementation of a parallel performance data management framework , 2005, 2005 International Conference on Parallel Processing (ICPP'05).
[62] Zhiling Lan,et al. A novel dynamic load balancing scheme for parallel systems , 2002, J. Parallel Distributed Comput..
[63] David A. Wood,et al. A Comparison of Trace-Sampling Techniques for Multi-Megabyte Caches , 1994, IEEE Trans. Computers.
[64] Sverker Holmgren,et al. Implementation Issues for High Performance CFD , 2004 .
[65] Thomas M. Conte,et al. Combining Trace Sampling with Single Pass Methods for Efficient Cache Simulation , 1998, IEEE Trans. Computers.
[66] Trevor N. Mudge,et al. Trace-driven memory simulation: a survey , 1997, CSUR.
[67] Alan Jay Smith,et al. Evaluating Associativity in CPU Caches , 1989, IEEE Trans. Computers.