Noise-Tolerant Explicit Stencil Computations for Nonuniform Process Execution Rates
[1] Franck Cappello,et al. Addressing failures in exascale computing , 2014, Int. J. High Perform. Comput. Appl..
[2] Adam Hammouda,et al. Noise-Tolerant Explicit Stencil Computations for Nonuniform Process Execution Rates , 2015 .
[3] Torsten Hoefler,et al. Using Simulation to Evaluate the Performance of Resilience Strategies at Scale , 2013, PMBS@SC.
[4] Martin Schulz,et al. Beyond DVFS: A First Look at Performance under a Hardware-Enforced Power Bound , 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum.
[5] Ulrich Rüde,et al. Cache Optimization for Structured and Unstructured Grid Multigrid , 2000 .
[6] Masato Takeichi,et al. Formal derivation of efficient parallel programs by construction of list homomorphisms , 1997, TOPL.
[7] Viktor K. Prasanna,et al. High Performance Computing - HiPC 2005, 12th International Conference, Goa, India, December 18-21, 2005, Proceedings , 2005, HiPC.
[8] Luiz André Barroso,et al. The tail at scale , 2013, CACM.
[9] Albert Cohen,et al. Coarse-Grained Loop Parallelization: Iteration Space Slicing vs Affine Transformations , 2009, 2009 Eighth International Symposium on Parallel and Distributed Computing.
[10] Christel Baier,et al. Principles of Model Checking (Representation and Mind Series) , 2008 .
[11] Christel Baier,et al. Principles of model checking , 2008 .
[12] Katherine A. Yelick,et al. Communication avoiding and overlapping for numerical linear algebra , 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[13] Phillip B. Gibbons. ACM Transactions on Parallel Computing , 2014 .
[14] Katherine Yelick,et al. Auto-tuning stencil codes for cache-based multicore platforms , 2009 .
[15] Kurt Mehlhorn,et al. Algorithms - ESA 2008, 16th Annual European Symposium, Karlsruhe, Germany, September 15-17, 2008. Proceedings , 2008, ESA.
[16] Dan Tsafrir,et al. System noise, OS clock ticks, and fine-grained parallel applications , 2005, ICS '05.
[17] Aditya Konduri,et al. Asynchronous finite-difference schemes for partial differential equations , 2014, J. Comput. Phys..
[18] Wu-chun Feng,et al. Making a Case for Efficient Supercomputing , 2003, ACM Queue.
[19] Allen D. Malony,et al. The ghost in the machine: observing the effects of kernel operation on parallel application performance , 2007, Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (SC '07).
[20] B. Fryxell,et al. FLASH: An Adaptive Mesh Hydrodynamics Code for Modeling Astrophysical Thermonuclear Flashes , 2000 .
[21] Torsten Hoefler,et al. Characterizing the Influence of System Noise on Large-Scale Applications by Simulation , 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.
[22] Susan Coghlan,et al. The Influence of Operating Systems on the Performance of Collective Operations at Extreme Scale , 2006, 2006 IEEE International Conference on Cluster Computing.
[23] L. Ridgway Scott,et al. Scientific Parallel Computing , 2005 .
[24] John D. Davis,et al. Accounting for Variability in Large-Scale Cluster Power Models , 2011 .
[25] Christina Freytag,et al. Using MPI: Portable Parallel Programming with the Message Passing Interface , 2016 .
[26] Satish Narayana Srirama,et al. Viability of the bulk synchronous parallel model for science on cloud , 2013, 2013 International Conference on High Performance Computing & Simulation (HPCS).
[27] Kevin Skadron,et al. Performance modeling and automatic ghost zone optimization for iterative stencil loops on GPUs , 2009, ICS.
[28] Nisheeth K. Vishnoi,et al. The Impact of Noise on the Scaling of Collectives: A Theoretical Approach , 2005, HiPC.
[29] F. Petrini,et al. The Case of the Missing Supercomputer Performance: Achieving Optimal Performance on the 8,192 Processors of ASCI Q , 2003, ACM/IEEE SC 2003 Conference (SC'03).
[30] David G. Wonnacott,et al. Using time skewing to eliminate idle time due to memory bandwidth and network limitations , 2000, Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000.
[31] James Demmel,et al. Communication-optimal parallel algorithm for Strassen's matrix multiplication , 2012, SPAA '12.
[32] Yifeng Chen,et al. Logic of global synchrony , 2001, TOPL.
[33] Anthony T. Chronopoulos,et al. s-step iterative methods for symmetric linear systems , 1989 .
[34] William W. Pugh,et al. Fine-grained analysis of array computations , 1998 .
[35] Leslie G. Valiant,et al. A bridging model for multi-core computing , 2008, J. Comput. Syst. Sci..
[36] Anthony Skjellum,et al. Using MPI: portable parallel programming with the message-passing interface, 2nd Edition , 1999, Scientific and engineering computation series.
[37] John D. McCalpin,et al. Time Skewing: A Value-Based Approach to Optimizing for Memory Locality , 1999 .
[38] Leslie G. Valiant,et al. A bridging model for parallel computation , 1990, CACM.
[39] M. Snir,et al. Ghost Cell Pattern , 2010, ParaPLoP '10.
[40] Anthony Skjellum,et al. Using MPI: Portable Programming with the Message-Passing Interface , 1999 .
[41] Kevin T. Pedretti,et al. The impact of system design parameters on application noise sensitivity , 2010, 2010 IEEE International Conference on Cluster Computing.
[42] Samuel Williams,et al. Implicit and explicit optimizations for stencil computations , 2006, MSPC '06.
[43] Jeremy M. R. Martin,et al. Dynamic BSP : towards a flexible approach to parallel computing over the grid , 2004 .
[44] Carl E. Landwehr,et al. Basic concepts and taxonomy of dependable and secure computing , 2004, IEEE Transactions on Dependable and Secure Computing.
[45] Larry Carter,et al. Rescheduling for Locality in Sparse Matrix Computations , 2001, International Conference on Computational Science.
[46] Torsten Hoefler,et al. The Effect of Network Noise on Large-Scale Collective Communications , 2009, Parallel Process. Lett..
[47] Frédéric Loulergue,et al. Systematic Development of Correct Bulk Synchronous Parallel Programs , 2010, 2010 International Conference on Parallel and Distributed Computing, Applications and Technologies.
[48] Henry Hoffmann,et al. Patterns and statistical analysis for understanding reduced resource computing , 2010, OOPSLA.
[49] Alan Stewart. A programming model for BSP with partitioned synchronisation , 2010, Formal Aspects of Computing.
[50] Yun He,et al. A Ghost Cell Expansion Method for Reducing Communications in Solving PDE Problems , 2001, ACM/IEEE SC 2001 Conference (SC'01).
[51] Ron Brightwell,et al. Characterizing application sensitivity to OS interference using kernel-level noise injection , 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis.
[52] James Demmel,et al. Avoiding communication in sparse matrix computations , 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.
[53] Chau-Wen Tseng,et al. Tiling Optimizations for 3D Scientific Computations , 2000, ACM/IEEE SC 2000 Conference (SC'00).
[54] Edward A. Ashcroft,et al. Proving Assertions about Parallel Programs , 1975, J. Comput. Syst. Sci..
[55] Gérard M. Baudet,et al. Asynchronous Iterative Methods for Multiprocessors , 1978, JACM.
[56] G. Allen,et al. Supporting Efficient Execution in Heterogeneous Distributed Computing Environments with Cactus and Globus , 2001, ACM/IEEE SC 2001 Conference (SC'01).
[57] Jin-Soo Kim,et al. Relaxed Barrier Synchronization for the BSP Model of Computation on Message-Passing Architectures , 1998, Inf. Process. Lett..
[58] Pradipta De,et al. Impact of Noise on Scaling of Collectives: An Empirical Evaluation , 2006, HiPC.
[59] Albert Cohen,et al. Synchronization-Free Automatic Parallelization: Beyond Affine Iteration-Space Slicing , 2009, LCPC.
[60] Richard J. Lipton,et al. Reduction: a method of proving properties of parallel programs , 1975, CACM.
[61] Vasil P. Vasilev. BSPGRID: Variable Resources Parallel Computation and Multiprogrammed Parallelism , 2003, Parallel Process. Lett..
[62] John Shalf,et al. Abstract Machine Models and Proxy Architectures for Exascale Computing , 2014, 2014 Hardware-Software Co-Design for High Performance Computing.
[63] Vijayalakshmi Srinivasan,et al. Programming with relaxed synchronization , 2012, RACES '12.
[64] Volker Strumpen,et al. Cache oblivious stencil computations , 2005, ICS '05.