Addressing the Complexity of HPC in the Cloud: Emergence, Self-Organisation, Self-Management, and the Separation of Concerns

New use scenarios, workloads, and increased heterogeneity, combined with rapid growth in adoption, are increasing the management complexity of cloud computing at all levels. High performance computing (HPC) is a segment of the IT market that poses significant technical challenges for cloud service providers and exemplifies many of the challenges they face as they conceptualise the next generation of cloud architectures. This chapter introduces cloud computing, HPC, and the challenges of supporting HPC in the cloud. It discusses how heterogeneous computing and the concepts of self-organisation, self-management, and separation of concerns can inform novel cloud architecture designs and support HPC in the cloud at hyperscale. Three illustrative application scenarios for HPC in the cloud are discussed: (i) oil and gas exploration, (ii) ray tracing, and (iii) genomics.
