Non-uniform Memory Affinity Strategy in Multi-Threaded Sparse Matrix Computations