Performance and energy analysis of OpenMP runtime systems with dense linear algebra algorithms