Dynamic Self-assembling Petaflop Scale Clusters

High Performance Computing (HPC) has been studied and used in the scientific community for decades; the Message Passing Interface (MPI) was first introduced in 1992. Similarly, commercial businesses have relied on High Throughput Computing (HTC) for the past two decades, and MapReduce platforms became popular with the advent of Very Large Databases (VLDBs) and Big Data. We are now seeing a convergence between HPC and HTC that provides faster and cheaper parallel computation. The emergence of MPI as a scalable and viable parallel platform, along with the acceptance of MapReduce for tackling large data sets, opens the door to a host of new applications, particularly in biomedical, public health, scientific, and health informatics research. This convergence makes it possible for every device to act as a parallel node. In this paper we explore this convergence and present a method for creating dynamic self-assembling clusters from commodity hardware and mobile devices.