ZCCloud: Exploring Wasted Green Power for High-Performance Computing

In supercomputer centers, available power, cooling, or carbon footprint often limits supercomputer performance. We propose a new approach to continue scaling that avoids many of these limits: augmenting a traditional system with another that runs only on stranded power, that is, "wasted" renewable power. This excess power cannot be economically distributed through the grid and is only intermittently available. We call this approach Zero-carbon Cloud (ZCCloud). We explore the potential benefits of unreliable resources with production DOE HPC workloads using a simple periodic model, and identify the job types that benefit most (capability jobs and on-time jobs). The benefits scale with duty factor and resource quantity. Next, to create realistic models of stranded power, we study 28 months of Mid-continent Independent System Operator (MISO) power market history (1,259 generators, 77 million 5-minute intervals). We find that opportunity varies, but the best single wind site can provide an 80% duty factor and 20 MW of average stranded power. Combining sites further improves duty factor. With resource volatility models from the MISO study, we simulate production DOE HPC workloads and find that stranded-power HPC (ZCCloud) can provide significant benefit, decreasing average job wait time by 50%.
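As a rough illustration of the duty factor notion used above, the following sketch treats each 5-minute interval as a boolean flag for whether stranded power is available, and takes duty factor to be the fraction of intervals that are flagged. The function names, the sample data, and the rule for combining sites (an interval counts as usable if at least one site has power) are assumptions made for illustration only; they are not taken from the paper's models.

```python
from typing import Sequence


def duty_factor(available: Sequence[bool]) -> float:
    """Fraction of intervals in which stranded power is available."""
    if not available:
        raise ValueError("need at least one interval")
    return sum(available) / len(available)


def combine_sites(*sites: Sequence[bool]) -> list[bool]:
    """Assumed combining rule: an interval is usable if any site has power."""
    return [any(flags) for flags in zip(*sites)]


# Hypothetical availability traces for two sites (one flag per 5-minute interval).
site_a = [True, True, False, False, True, True, True, False]
site_b = [False, True, True, False, False, True, True, True]
combined = combine_sites(site_a, site_b)

print(duty_factor(site_a))    # 0.625
print(duty_factor(site_b))    # 0.625
print(duty_factor(combined))  # 0.875
```

In this toy data each site alone is available 5 of 8 intervals, while the combination covers 7 of 8, which mirrors the qualitative claim that combining sites improves duty factor.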
