A Hierarchical Approach for Load Balancing on Parallel Multi-core Systems
Laxmikant V. Kalé | Philippe Olivier Alexandre Navaux | Jean-François Méhaut | Laércio Lima Pilla | Chao Mei | Daniel Cordeiro | François Broquedis | Abhinav Bhatele | Christiane Pousa Ribeiro
[1] Wenguang Chen, et al. MPIPP: an automatic profile-guided parallel process placement toolset for SMP clusters and multiclusters, 2006, ICS '06.
[2] Thomas R. Gross, et al. Memory management in NUMA multicore systems: trapped between cache contention and interconnect overhead, 2011, ISMM '11.
[3] Laxmikant V. Kalé, et al. Overcoming scaling challenges in biomolecular simulations across multiple platforms, 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.
[4] Joseph Y.-T. Leung, et al. Handbook of Scheduling: Algorithms, Models, and Performance Analysis, 2004.
[5] Laxmikant V. Kalé, et al. A Comparative Analysis of Load Balancing Algorithms Applied to a Weather Forecast Model, 2010, 2010 22nd International Symposium on Computer Architecture and High Performance Computing.
[6] Amitabh Sinha, et al. Projections: A Preliminary Performance Tool for Charm, 2007.
[7] Guillaume Mercier, et al. Towards an Efficient Process Placement Policy for MPI Applications in Multicore Environments, 2009, PVM/MPI.
[8] Samuel Thibault, et al. Structuring the execution of OpenMP applications for multicore architectures, 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS).
[9] Laxmikant V. Kalé, et al. Optimizing a parallel runtime system for multicore clusters: a case study, 2010, TG.
[10] Laxmikant V. Kalé, et al. CHARM++: a portable concurrent object oriented system based on C++, 1993, OOPSLA '93.
[11] Laxmikant V. Kalé, et al. Dynamic topology aware load balancing algorithms for molecular dynamics applications, 2009, ICS.
[12] Franck Cappello, et al. Grid'5000: A Large Scale And Highly Reconfigurable Experimental Grid Testbed, 2006, Int. J. High Perform. Comput. Appl.
[13] Bruno Raffin, et al. A Work Stealing Algorithm for Parallel Loops on Shared Cache Multicores, 2010.
[14] Emmanuel Jeannot, et al. Near-Optimal Placement of MPI Processes on Hierarchical NUMA Architectures, 2010, Euro-Par.
[15] John Kubiatowicz, et al. Juggle: proactive load balancing on multicore computers, 2011, HPDC '11.
[16] Jean Roman, et al. SCOTCH: A Software Package for Static Mapping by Dual Recursive Bipartitioning of Process and Architecture Graphs, 1996, HPCN Europe.
[17] Jean-François Méhaut, et al. Memory Affinity for Hierarchical Shared Memory Multiprocessors, 2009, 2009 21st International Symposium on Computer Architecture and High Performance Computing.
[18] Guillaume Mercier, et al. hwloc: A Generic Framework for Managing Hardware Affinities in HPC Applications, 2010, 2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing.