The Interplay Between Energy Efficiency and Resilience for Scalable High Performance Computing Systems
[1] Yifeng Guo,et al. Generalized Standby-Sparing techniques for energy-efficient fault tolerance in multiprocessor real-time systems , 2013, 2013 IEEE 19th International Conference on Embedded and Real-Time Computing Systems and Applications.
[2] J. Duell. The design and implementation of Berkeley Lab's Linux checkpoint/restart , 2005 .
[3] Thomas Hérault,et al. Algorithm-based fault tolerance for dense matrix factorizations , 2012, PPoPP '12.
[4] Mitsuhisa Sato,et al. Profile-based optimization of power performance by using dynamic voltage scaling on a PC cluster , 2006, Proceedings 20th IEEE International Parallel & Distributed Processing Symposium.
[5] Vipin Kumar,et al. Isoefficiency: measuring the scalability of parallel algorithms and architectures , 1993, IEEE Parallel & Distributed Technology: Systems & Applications.
[6] Zizhong Chen,et al. Performance of MPI broadcast algorithms , 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.
[7] Robert A. van de Geijn,et al. Collective communication on architectures that support simultaneous communication over multiple links , 2006, PPoPP '06.
[8] Shuaiwen Song,et al. Investigating the Interplay between Energy Efficiency and Resilience in High Performance Computing , 2015, 2015 IEEE International Parallel and Distributed Processing Symposium.
[9] John A. Gunnels,et al. Extending stability beyond CPU millennium: a micron-scale atomistic simulation of Kelvin-Helmholtz instability , 2007, Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (SC '07).
[10] Jack Dongarra,et al. Distributed Dense Numerical Linear Algebra Algorithms on Massively Parallel Architectures: DPLASMA , 2011 .
[11] Dong Li,et al. Quantitatively Modeling Application Resilience with the Data Vulnerability Factor , 2014, SC14: International Conference for High Performance Computing, Networking, Storage and Analysis.
[12] Zizhong Chen,et al. Online-ABFT: an online algorithm based fault tolerance scheme for soft error detection in iterative methods , 2013, PPoPP '13.
[13] Zizhong Chen,et al. A survey of power and energy efficient techniques for high performance numerical linear algebra operations , 2014, Parallel Comput..
[14] H. Mair,et al. A 65-nm Mobile Multimedia Applications Processor with an Adaptive Power Management Scheme to Compensate for Variations , 2007, 2007 IEEE Symposium on VLSI Circuits.
[15] Laxmikant V. Kalé,et al. Assessing Energy Efficiency of Fault Tolerance Protocols for HPC Systems , 2012, 2012 IEEE 24th International Symposium on Computer Architecture and High Performance Computing.
[16] James Demmel,et al. Communication-optimal parallel algorithm for Strassen's matrix multiplication , 2012, SPAA '12.
[17] Frank Mueller,et al. ScalaBenchGen: Auto-Generation of Communication Benchmarks Traces , 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium.
[18] Li Tan,et al. Optimizing Energy Efficiency for Distributed Dense Matrix Factorizations via Utilizing Algorithmic Characteristics , 2014 .
[19] Martin Schulz,et al. Bounding energy consumption in large-scale MPI programs , 2007, Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (SC '07).
[20] Torsten Hoefler,et al. A practically constant-time MPI Broadcast Algorithm for large-scale InfiniBand Clusters with Multicast , 2007, 2007 IEEE International Parallel and Distributed Processing Symposium.
[21] Karthikeyan Sankaralingam,et al. Dark Silicon and the End of Multicore Scaling , 2012, IEEE Micro.
[22] Babak Falsafi,et al. Multi-bit Error Tolerant Caches Using Two-Dimensional Error Coding , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[23] Xin Yuan,et al. Automatic generation and tuning of MPI collective communication routines , 2005, ICS '05.
[24] Mahmut T. Kandemir,et al. Exploiting barriers to optimize power consumption of CMPs , 2005, 19th IEEE International Parallel and Distributed Processing Symposium.
[25] Rami G. Melhem,et al. The effects of energy management on reliability in real-time embedded systems , 2004, IEEE/ACM International Conference on Computer Aided Design, 2004. ICCAD-2004..
[26] Haoqiang Jin,et al. Performance characteristics of the multi-zone NAS parallel benchmarks , 2004, 18th International Parallel and Distributed Processing Symposium, 2004. Proceedings..
[27] Hsien-Hsin S. Lee,et al. Extending Amdahl's Law for Energy-Efficient Computing in the Many-Core Era , 2008, Computer.
[28] Jian Li,et al. Power-efficient time-sensitive mapping in heterogeneous systems , 2012, 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT).
[29] Chris Fallin,et al. Memory power management via dynamic voltage/frequency scaling , 2011, ICAC '11.
[30] Thomas Rauber,et al. Automatic Tuning of PDGEMM Towards Optimal Performance , 2005, Euro-Par.
[31] Zizhong Chen,et al. Slow Down or Halt: Saving the Optimal Energy for Scalable HPC Systems , 2015, ICPE.
[32] Rong Ge,et al. Performance-constrained Distributed DVS Scheduling for Scientific Applications on Power-aware Clusters , 2005, ACM/IEEE SC 2005 Conference (SC'05).
[33] Wei Wu,et al. Reducing cache power with low-cost, multi-bit error-correcting codes , 2010, ISCA.
[34] Jian Li,et al. Power-performance considerations of parallel computing on chip multiprocessors , 2005, TACO.
[35] Franck Cappello,et al. ECOFIT: A Framework to Estimate Energy Consumption of Fault Tolerance Protocols for HPC Applications , 2013, 2013 13th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing.
[36] Ulrich Kremer,et al. The design, implementation, and evaluation of a compiler algorithm for CPU energy reduction , 2003, PLDI '03.
[37] Mateo Valero,et al. Understanding the future of energy-performance trade-off via DVFS in HPC environments , 2012, J. Parallel Distributed Comput..
[38] G. Amdahl,et al. Validity of the single processor approach to achieving large scale computing capabilities , 1967, AFIPS '67 (Spring).
[39] Bronis R. de Supinski,et al. Adagio: making DVS practical for complex HPC applications , 2009, ICS.
[40] Qingyuan Deng,et al. MemScale: active low-power modes for main memory , 2011, ASPLOS XVI.
[41] Ragunathan Rajkumar,et al. Critical power slope: understanding the runtime effects of frequency scaling , 2002, ICS '02.
[42] Mahmut T. Kandemir,et al. Reducing energy consumption of parallel sparse matrix applications through integrated link/CPU voltage scaling , 2007, The Journal of Supercomputing.
[43] Wu-chun Feng,et al. A Power-Aware Run-Time System for High-Performance Computing , 2005, ACM/IEEE SC 2005 Conference (SC'05).
[44] Shuaiwen Song,et al. Iso-Energy-Efficiency: An Approach to Power-Constrained Parallel Computation , 2011, 2011 IEEE International Parallel & Distributed Processing Symposium.
[45] Vivek Sarkar,et al. Software challenges in extreme scale systems , 2009 .
[46] Bronis R. de Supinski,et al. Design, Modeling, and Evaluation of a Scalable Multi-level Checkpointing System , 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.
[47] William Harrod. A journey to exascale computing , 2012, 2012 SC Companion: High Performance Computing, Networking Storage and Analysis.
[48] Rolf Riesen,et al. Evaluating energy savings for checkpoint/restart , 2013, E2SC '13.
[49] Enrique S. Quintana-Ortí,et al. Modeling power and energy of the task-parallel Cholesky factorization on multicore processors , 2012, Computer Science - Research and Development.
[50] Dong Li,et al. PowerPack: Energy Profiling and Analysis of High-Performance Systems and Applications , 2010, IEEE Transactions on Parallel and Distributed Systems.
[51] Onur Mutlu,et al. Accelerating critical section execution with asymmetric multi-core architectures , 2009, ASPLOS.
[52] Manoj Sachdev,et al. Efficient adaptive voltage scaling system through on-chip critical path emulation , 2004, Proceedings of the 2004 International Symposium on Low Power Electronics and Design (IEEE Cat. No.04TH8758).
[53] Radu Teodorescu,et al. Dynamic reduction of voltage margins by leveraging on-chip ECC in Itanium II processors , 2013, ISCA.
[54] Scott Shenker,et al. Scheduling for reduced CPU energy , 1994, OSDI '94.
[55] Wei Wang,et al. A continuous, analytic drain-current model for DG MOSFETs , 2004, IEEE Electron Device Letters.
[56] Wayne Luk,et al. Dynamic scheduling Monte-Carlo framework for multi-accelerator heterogeneous clusters , 2010, 2010 International Conference on Field-Programmable Technology.
[57] J. D. Hogg,et al. A DAG-based parallel Cholesky factorization for multicore systems , 2008 .
[58] Alan H. Karp,et al. Measuring parallel processor performance , 1990, CACM.
[59] James Demmel,et al. Improving communication performance in dense linear algebra via topology aware collectives , 2011, 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[60] Jung Ho Ahn,et al. MAGE: Adaptive Granularity and ECC for resilient and power efficient memory systems , 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[61] Jaeyoung Choi,et al. Design and Implementation of the ScaLAPACK LU, QR, and Cholesky Factorization Routines , 1994, Sci. Program..
[62] Mitsuhisa Sato,et al. Empirical study on Reducing Energy of Parallel Programs using Slack Reclamation by DVFS in a Power-scalable High Performance Cluster , 2006, 2006 IEEE International Conference on Cluster Computing.
[63] Rong Ge,et al. Energy Efficient Parallel Matrix-Matrix Multiplication for DVFS-enabled Clusters , 2012, 2012 41st International Conference on Parallel Processing Workshops.
[64] Xin Yuan,et al. CC--MPI: a compiled communication capable MPI prototype for ethernet switched clusters , 2003, PPoPP '03.
[65] Rong Ge,et al. Power-Aware Speedup , 2007, 2007 IEEE International Parallel and Distributed Processing Symposium.
[66] John W. Young,et al. A first order approximation to the optimum checkpoint interval , 1974, CACM.
[67] Jaeyoung Choi. A new parallel matrix multiplication algorithm on distributed-memory concurrent computers , 1998, Concurr. Pract. Exp..
[68] Shuaiwen Song,et al. Scalable Energy Efficiency with Resilience for High Performance Computing Systems , 2016, ACM Trans. Archit. Code Optim..
[69] Hiroto Yasuura,et al. Voltage scheduling problem for dynamically variable voltage processors , 1998, Proceedings. 1998 International Symposium on Low Power Electronics and Design (IEEE Cat. No.98TH8379).
[70] Zizhong Chen,et al. FT-ScaLAPACK: correcting soft errors on-line for ScaLAPACK Cholesky, QR, and LU factorization routines , 2014, HPDC '14.
[71] Efraim Rotem,et al. Energy Aware Race to Halt: A Down to EARtH Approach for Platform Energy Management , 2014, IEEE Computer Architecture Letters.
[72] Dhabaleswar K. Panda,et al. Power-Check: An Energy-Efficient Checkpointing Framework for HPC Clusters , 2015, 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing.
[73] David K. Lowenthal,et al. Using multiple energy gears in MPI programs on a power-scalable cluster , 2005, PPoPP.
[74] David K. Lowenthal,et al. Just In Time Dynamic Voltage Scaling: Exploiting Inter-Node Slack to Save Energy in MPI Programs , 2005, ACM/IEEE SC 2005 Conference (SC'05).
[75] Mahmut T. Kandemir,et al. Reducing power with performance constraints for parallel sparse applications , 2005, 19th IEEE International Parallel and Distributed Processing Symposium.
[76] Rami G. Melhem,et al. Energy-aware checkpointing of divisible tasks with soft or hard deadlines , 2013, 2013 International Green Computing Conference Proceedings.
[77] Dong Li,et al. Hybrid MPI/OpenMP power-aware computing , 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS).
[78] Sharad Malik,et al. EPROF: An energy/performance/reliability optimization framework for streaming applications , 2012, 17th Asia and South Pacific Design Automation Conference.
[79] Michael C. Huang,et al. The thrifty barrier: energy-aware synchronization in shared-memory multiprocessors , 2004, 10th International Symposium on High Performance Computer Architecture (HPCA'04).
[80] Qian Zhu,et al. Power-Aware Consolidation of Scientific Workflows in Virtualized Environments , 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.
[81] Dong Li,et al. A2E: Adaptively aggressive energy efficient DVFS scheduling for data intensive applications , 2013, 2013 IEEE 32nd International Performance Computing and Communications Conference (IPCCC).
[82] Massoud Pedram,et al. Fine-grained dynamic voltage and frequency scaling for precise energy and performance trade-off based on the ratio of off-chip access to on-chip computation times , 2004, Proceedings Design, Automation and Test in Europe Conference and Exhibition.
[83] Zhiling Lan,et al. Reliability-aware scalability models for high performance computing , 2009, 2009 IEEE International Conference on Cluster Computing and Workshops.
[84] Charles E. Leiserson,et al. On-the-Fly Pipeline Parallelism , 2015, ACM Trans. Parallel Comput..
[85] Dakai Zhu,et al. Energy Management for Real-Time Embedded Systems with Reliability Requirements , 2006, 2006 IEEE/ACM International Conference on Computer Aided Design.
[86] Antonia Zhai,et al. Energy efficient speculative threads: Dynamic thread allocation in same-ISA heterogeneous multicore systems , 2010, 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT).
[87] Enrique S. Quintana-Ortí,et al. Reducing Energy Consumption of Dense Linear Algebra Operations on Hybrid CPU-GPU Platforms , 2012, 2012 IEEE 10th International Symposium on Parallel and Distributed Processing with Applications.
[88] John T. Daly,et al. A higher order estimate of the optimum checkpoint interval for restart dumps , 2006, Future Gener. Comput. Syst..
[89] Lieven Eeckhout,et al. SWEEP: evaluating computer system energy efficiency using synthetic workloads , 2011, HiPEAC.
[90] David K. Lowenthal,et al. Minimizing execution time in MPI programs on an energy-constrained, power-scalable cluster , 2006, PPoPP '06.
[91] Jonathan Chang,et al. A 45 nm 8-Core Enterprise Xeon® Processor , 2009, IEEE Journal of Solid-State Circuits.
[92] Dong Li,et al. Strategies for Energy-Efficient Resource Management of Hybrid Programming Models , 2013, IEEE Transactions on Parallel and Distributed Systems.
[93] Rajiv Gupta,et al. Lightweight fault detection in parallelized programs , 2013, Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization (CGO).
[94] Andrew S. Cassidy,et al. Beyond Amdahl's Law: An Objective Function That Links Multiprocessor Performance Gains to Delay and Energy , 2012, IEEE Transactions on Computers.
[95] Hui Liu,et al. Optimizing Process-to-Core Mappings for Two Dimensional Broadcast/Reduce on Multicore Architectures , 2011, 2011 International Conference on Parallel Processing.
[96] Kuo-Chi Lin,et al. An incremental genetic algorithm approach to multiprocessor scheduling , 2004, IEEE Transactions on Parallel and Distributed Systems.
[97] D.K. Lowenthal,et al. Adaptive, Transparent Frequency and Voltage Scaling of Communication Phases in MPI Programs , 2006, ACM/IEEE SC 2006 Conference (SC'06).
[98] Dong Li,et al. HP-DAEMON: High Performance Distributed Adaptive Energy-efficient Matrix-multiplicatiON , 2014, ICCS.
[99] Rafael Mayo,et al. Analysis of Strategies to Save Energy for Message-Passing Dense Linear Algebra Kernels , 2012, 2012 20th Euromicro International Conference on Parallel, Distributed and Network-based Processing.
[100] Ramesh Subramonian,et al. LogP: towards a realistic model of parallel computation , 1993, PPOPP '93.
[101] Xue Liu,et al. Power-Aware CPU Utilization Control for Distributed Real-Time Systems , 2009, 2009 15th IEEE Real-Time and Embedded Technology and Applications Symposium.
[102] Xiang Cheng,et al. Reducing Operational Costs through Consolidation with Resource Prediction in the Cloud , 2012, 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012).
[103] Zizhong Chen,et al. TX: Algorithmic Energy Saving for Distributed Dense Matrix Factorizations , 2014, 2014 5th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems.
[104] Vincent Heuveline,et al. Analysis and optimization of power consumption in the iterative solution of sparse linear systems on multi-core and many-core platforms , 2011, 2011 International Green Computing Conference and Workshops.
[105] Christine Morin,et al. Energy Management in IaaS Clouds: A Holistic Approach , 2012, 2012 IEEE Fifth International Conference on Cloud Computing.
[106] Dong Li,et al. Improving performance and energy efficiency of matrix multiplication via pipeline broadcast , 2013, 2013 IEEE International Conference on Cluster Computing (CLUSTER).
[107] Hui Liu,et al. High performance linpack benchmark: a fault tolerant implementation without checkpointing , 2011, ICS '11.
[108] James H. Laros,et al. Metrics for Evaluating Energy Saving Techniques for Resilient HPC Systems , 2014, 2014 IEEE International Parallel & Distributed Processing Symposium Workshops.
[109] Rajkumar Buyya,et al. Power Aware Scheduling of Bag-of-Tasks Applications with Deadline Constraints on DVS-enabled Clusters , 2007, Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07).
[110] Zhiyuan Wang. Reliability Speedup: An Effective Metric for Parallel Application with Checkpointing , 2009, 2009 International Conference on Parallel and Distributed Computing, Applications and Technologies.
[111] Albert Y. Zomaya,et al. Some observations on optimal frequency selection in DVFS-based energy consumption minimization , 2011, J. Parallel Distributed Comput..
[112] Bruce Jacob,et al. A control-theoretic approach to dynamic voltage scheduling , 2003, CASES '03.
[113] Michael S. Hsiao,et al. Compiler-directed dynamic voltage/frequency scheduling for energy reduction in microprocessors , 2001, ISLPED '01.
[114] Rong Ge,et al. CPU MISER: A Performance-Directed, Run-Time System for Power-Aware Clusters , 2007, 2007 International Conference on Parallel Processing (ICPP 2007).
[115] Alaa R. Alameldeen,et al. Trading off Cache Capacity for Reliability to Enable Low Voltage Operation , 2008, 2008 International Symposium on Computer Architecture.
[116] Petru Eles,et al. Scheduling and voltage scaling for energy/reliability trade-offs in fault-tolerant time-triggered embedded systems , 2007, 2007 5th IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS).
[117] Shuaiwen Song,et al. A Simplified and Accurate Model of Power-Performance Efficiency on Emergent GPU Architectures , 2013, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing.
[118] Alexandre Yakovlev,et al. Studying the Interplay of Concurrency, Performance, Energy and Reliability with ArchOn -- An Architecture-Open Resource-Driven Cross-Layer Modelling Framework , 2014, 2014 14th International Conference on Application of Concurrency to System Design.
[119] Enrique S. Quintana-Ortí,et al. Improving power efficiency of dense linear algebra algorithms on multi-core processors via slack control , 2011, 2011 International Conference on High Performance Computing & Simulation.
[120] Krishnendu Chakrabarty,et al. Energy-Aware Fault Tolerance in Fixed-Priority Real-Time Embedded Systems , 2003, ICCAD 2003.
[121] David Blaauw,et al. Razor: A Low-Power Pipeline Based on Circuit-Level Timing Speculation , 2003, MICRO.