Process Placement in Multicore Clusters: Algorithmic Issues and Practical Techniques
Emmanuel Jeannot | Guillaume Mercier | François Tessier
[1] B. Brandfass,et al. Rank reordering for MPI communication optimization , 2013 .
[2] Jonathan Green,et al. Multi-core and Network Aware MPI Topology Functions , 2011, EuroMPI.
[3] Brian W. Kernighan,et al. An efficient heuristic procedure for partitioning graphs , 1970, Bell Syst. Tech. J..
[4] Takao Hatazaki,et al. Rank Reordering Strategy for MPI Topology Creation Functions , 1998, PVM/MPI.
[5] Naixue Xiong,et al. An approach for matching communication patterns in parallel applications , 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.
[6] Torsten Hoefler,et al. Generic topology mapping strategies for large-scale parallel architectures , 2011, ICS '11.
[7] M. L. Norman,et al. Simulating Radiating and Magnetized Flows in Multiple Dimensions with ZEUS-MP , 2005, astro-ph/0511545.
[8] Hubert Ritzdorf,et al. The scalable process topology interface of MPI 2.2 , 2011, Concurr. Comput. Pract. Exp..
[9] Jesper Larsson Träff. Implementing the MPI process topology mechanism , 2002, SC '02.
[10] Wenguang Chen,et al. MPIPP: an automatic profile-guided parallel process placement toolset for SMP clusters and multiclusters , 2006, ICS '06.
[11] David H. Bailey,et al. NAS parallel benchmark results , 1993, IEEE Parallel & Distributed Technology: Systems & Applications.
[12] Dhabaleswar K. Panda,et al. Design of a scalable InfiniBand topology service to enable network-topology-aware placement of processes , 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[13] Pavan Balaji,et al. Mapping communication layouts to network hardware characteristics on massive-scale blue gene systems , 2011, Computer Science - Research and Development.
[14] Jean Roman,et al. SCOTCH: A Software Package for Static Mapping by Dual Recursive Bipartitioning of Process and Architecture Graphs , 1996, HPCN Europe.
[15] Emmanuel Jeannot,et al. Improving MPI Applications Performance on Multicore Clusters with Rank Reordering , 2011, EuroMPI.
[16] Guillaume Mercier,et al. hwloc: A Generic Framework for Managing Hardware Affinities in HPC Applications , 2010, 2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing.
[17] Guillaume Mercier,et al. Implementation and evaluation of shared-memory communication and synchronization operations in MPICH2 using the Nemesis communication subsystem , 2007, Parallel Comput..
[18] Emmanuel Jeannot,et al. Near-Optimal Placement of MPI Processes on Hierarchical NUMA Architectures , 2010, Euro-Par.
[19] Philippe Olivier Alexandre Navaux,et al. Multi-core aware process mapping and its impact on communication overhead of parallel applications , 2009, 2009 IEEE Symposium on Computers and Communications.
[20] Bruce Hendrickson,et al. The Chaco user's guide. Version 1.0 , 1993 .
[21] José E. Moreira,et al. Blue Gene system software - Topology mapping for Blue Gene/L supercomputer , 2006, SC.
[22] Tomio Hirata,et al. Approximation Algorithms for the Weighted Independent Set Problem , 2005, WG.
[23] Thomas Hérault,et al. Process Distance-Aware Adaptive MPI Collective Communications , 2011, 2011 IEEE International Conference on Cluster Computing.
[24] George Bosilca,et al. Open MPI: Goals, Concept, and Design of a Next Generation MPI Implementation , 2004, PVM/MPI.
[25] José E. Moreira,et al. Topology Mapping for Blue Gene/L Supercomputer , 2006, ACM/IEEE SC 2006 Conference (SC'06).
[26] Message Passing Interface Forum. MPI: A message-passing interface standard , 1994 .
[27] Laxmikant V. Kalé,et al. CHARM++: a portable concurrent object oriented system based on C++ , 1993, OOPSLA '93.
[28] B. Hendrickson. The Chaco User's Guide Version , 2005 .
[29] Brian E. Smith,et al. Performance Effects of Node Mappings on the IBM BlueGene/L Machine , 2005, Euro-Par.
[30] John Shalf,et al. The International Exascale Software Project roadmap , 2011, Int. J. High Perform. Comput. Appl..
[31] Jin Zhang,et al. Process Mapping for MPI Collective Communications , 2009, Euro-Par.
[32] Guillaume Mercier,et al. Towards an Efficient Process Placement Policy for MPI Applications in Multicore Environments , 2009, PVM/MPI.
[33] Jeffrey M. Squyres,et al. Locality-Aware Parallel Process Mapping for Multi-core HPC Systems , 2011, 2011 IEEE International Conference on Cluster Computing.
[34] F. Pellegrini,et al. Static mapping by dual recursive bipartitioning of process architecture graphs , 1994, Proceedings of IEEE Scalable High Performance Computing Conference.
[35] Hao Zhu,et al. Hierarchical Collectives in MPICH2 , 2009, PVM/MPI.
[36] Kenji Ono,et al. Automatically optimized core mapping to subdomains of domain decomposition method on multicore parallel environments , 2013 .