A study of the effect of process malleability in the energy efficiency on GPU-based clusters
[1] Laxmikant V. Kalé,et al. Towards realizing the potential of malleable jobs , 2014, 2014 21st International Conference on High Performance Computing (HiPC).
[2] Krzysztof Rojek,et al. Machine learning method for energy reduction by utilizing dynamic mixed precision on GPU‐based supercomputers , 2019, Concurr. Comput. Pract. Exp..
[3] Dror G. Feitelson,et al. The workload on parallel supercomputers: modeling the characteristics of rigid jobs , 2003, J. Parallel Distributed Comput..
[4] Piotr K. Smolarkiewicz,et al. Multidimensional positive definite advection transport algorithm: an overview , 2006 .
[5] Gerassimos Barlas,et al. Multicore and GPU Programming: An Integrated Approach , 2014 .
[6] Srikumar Venugopal,et al. Architecting Malleable MPI Applications for Priority-driven Adaptive Scheduling , 2016, EuroMPI.
[7] Enrique S. Quintana-Ortí,et al. Modeling power consumption of 3D MPDATA and the CG method on ARM and Intel multicore architectures , 2017, The Journal of Supercomputing.
[8] Boleslaw K. Szymanski,et al. An Architecture for Reconfigurable Iterative MPI Applications in Dynamic Environments , 2005, PPAM.
[9] Rajesh Sudarsan,et al. Scheduling resizable parallel applications , 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.
[10] Roman Wyrzykowski,et al. Performance modeling of 3D MPDATA simulations on GPU cluster , 2016, The Journal of Supercomputing.
[11] Sergio Iserte,et al. DMR API: Improving cluster productivity by turning applications into malleable , 2018, Parallel Comput..
[12] Sergio Iserte,et al. Efficient Scalable Computing through Flexible Applications and Adaptive Workloads , 2017, 2017 46th International Conference on Parallel Processing Workshops (ICPPW).
[13] Boleslaw K. Szymanski,et al. Malleable iterative MPI applications , 2009, Concurr. Comput. Pract. Exp..
[14] Andy B. Yoo,et al. X-ray Pulse Compression Using Strained Crystals , 2002 .
[15] Hans-Joachim Bungartz,et al. Infrastructure and API Extensions for Elastic Execution of MPI Applications , 2016, EuroMPI.
[16] Jesús Labarta,et al. Collective Offload for Heterogeneous Clusters , 2015, 2015 IEEE 22nd International Conference on High Performance Computing (HiPC).
[17] Dror G. Feitelson,et al. Packing Schemes for Gang Scheduling , 1996, JSSPP.
[18] Sergio Iserte,et al. Dynamic reconfiguration of noniterative scientific applications: A case study with HPG aligner , 2019, Int. J. High Perform. Comput. Appl..
[19] Lukasz Szustak,et al. Strategy for data-flow synchronizations in stencil parallel computations on multi-/manycore systems , 2018, The Journal of Supercomputing.
[20] Laxmikant V. Kalé,et al. A Batch System with Efficient Adaptive Scheduling for Malleable and Evolving Applications , 2015, 2015 IEEE International Parallel and Distributed Processing Symposium.
[21] Roman Wyrzykowski,et al. Systematic adaptation of stencil‐based 3D MPDATA to GPU architectures , 2017, Concurr. Comput. Pract. Exp..
[22] Martin Burtscher,et al. Measuring GPU Power with the K20 Built-in Sensor , 2014, GPGPU@ASPLOS.
[23] Johannes M. Dieterich,et al. Malleable parallelism with minimal effort for maximal throughput and maximal hardware load , 2019, Computational and Theoretical Chemistry.
[24] J. Prusa,et al. EULAG, a computational model for multiscale flows , 2008 .
[25] Jesús Carretero,et al. Enhancing the performance of malleable MPI applications by using performance-aware dynamic reconfiguration , 2015, Parallel Comput..