Algorithms for Ultrascale Systems

through all the software levels seems reasonable. In this article, we discuss the current research efforts and results related to energy efficiency in the diverse areas of software. We conclude with open problems and
