Power-aware applications for scientific cluster and distributed computing

The aggregate power use of computing hardware is an important cost factor in scientific cluster and distributed computing systems. The Worldwide LHC Computing Grid (WLCG) is a major example of such a distributed computing system, used primarily for high-throughput computing (HTC) applications. It has a computing capacity and power consumption rivaling those of the largest supercomputers, and the computing capacity required from this system is expected to grow over the next decade. Optimizing the power utilization and cost of such systems is thus of great interest. A number of trends currently underway will provide new opportunities for power-aware optimizations. We discuss how power-aware software applications and scheduling might be used to reduce power consumption, both as autonomous entities and as part of a (globally) distributed system. As concrete examples of computing centers, we provide information on the large HEP-focused Tier-1 center at FNAL and on the Tigress High Performance Computing Center at Princeton University, which provides HPC resources in a university context.
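To illustrate the kind of power-aware scheduling discussed above, the following minimal Python sketch (not from the paper; the site names, per-core power figures, and electricity prices are invented assumptions) greedily places single-core HTC jobs on the distributed sites with the lowest electricity cost per core-hour first:

from dataclasses import dataclass

@dataclass
class Site:
    name: str
    free_cores: int
    watts_per_core: float   # assumed average active power draw per core
    price_per_kwh: float    # assumed local electricity price (currency/kWh)

def cost_per_core_hour(site):
    # Electricity cost of keeping one core busy for one hour at this site.
    return (site.watts_per_core / 1000.0) * site.price_per_kwh

def place_jobs(sites, n_jobs):
    # Greedily fill the cheapest sites first, up to their free capacity.
    assignment = {s.name: 0 for s in sites}
    remaining = n_jobs
    for site in sorted(sites, key=cost_per_core_hour):
        taken = min(site.free_cores, remaining)
        assignment[site.name] = taken
        remaining -= taken
        if remaining == 0:
            break
    return assignment

if __name__ == "__main__":
    sites = [
        Site("site-A", free_cores=4000, watts_per_core=10.0, price_per_kwh=0.12),
        Site("site-B", free_cores=2500, watts_per_core=8.0, price_per_kwh=0.08),
    ]
    print(place_jobs(sites, n_jobs=5000))

A production scheduler would also have to weigh data locality, fair-share policies, and time-varying electricity prices or carbon intensity; the sketch only shows where a power term could enter the placement decision.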
