Harnessing parallelism in multicore clusters with the all-pairs and wavefront abstractions

Both distributed systems and multicore computers are difficult programming environments. An expert programmer may be able to tune such systems to achieve high performance, but a non-expert may struggle to produce a program that even functions correctly. We argue that high-level abstractions are an effective way of making parallel computing accessible to the non-expert. An abstraction is a regularly structured framework into which a user may plug simple sequential programs to create very large parallel programs. By virtue of their regular structure and declarative specification, abstractions may be materialized on distributed, multicore, and distributed multicore systems with robust performance across a wide range of problem sizes. In previous work, we presented the All-Pairs abstraction for computing on distributed systems of single CPUs. In this paper, we extend All-Pairs to multicore systems and introduce Wavefront, which represents a number of problems in economics and bioinformatics. We demonstrate good scaling of both abstractions up to 32 cores on one machine and hundreds of cores in a distributed system.
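To make the idea of plugging a sequential function into a regular structure concrete, the following is a minimal single-machine sketch, not the authors' implementation. It assumes the usual formulations of the two abstractions: All-Pairs fills a matrix with M[i][j] = F(A[i], B[j]), and Wavefront fills a matrix from a given first row and first column using M[i][j] = F(M[i-1][j], M[i][j-1], M[i-1][j-1]). The function names, the argument order of F, and the use of a thread pool are illustrative assumptions; a real deployment would dispatch cells to processes or cluster nodes rather than Python threads.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def all_pairs(A, B, F, workers=4):
    """Return M with M[i][j] = F(A[i], B[j]). Every cell is independent."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # product(A, B) yields pairs in row-major order: A[0] with all of B, etc.
        flat = list(pool.map(lambda ab: F(ab[0], ab[1]), product(A, B)))
    n = len(B)
    return [flat[i * n:(i + 1) * n] for i in range(len(A))]


def wavefront(first_row, first_col, F, workers=4):
    """Fill M from its boundary; cells on one anti-diagonal run concurrently.

    Assumed recurrence: M[i][j] = F(M[i-1][j], M[i][j-1], M[i-1][j-1]).
    """
    rows, cols = len(first_col), len(first_row)
    M = [[None] * cols for _ in range(rows)]
    M[0] = list(first_row)
    for i in range(rows):
        M[i][0] = first_col[i]

    def cell(ij):
        i, j = ij
        return F(M[i - 1][j], M[i][j - 1], M[i - 1][j - 1])

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # d = i + j indexes the anti-diagonals of the interior cells.
        for d in range(2, rows + cols - 1):
            cells = [(i, d - i)
                     for i in range(max(1, d - cols + 1), min(rows, d))]
            values = list(pool.map(cell, cells))
            for (i, j), v in zip(cells, values):
                M[i][j] = v
    return M


if __name__ == "__main__":
    print(all_pairs([1, 2], [10, 20], lambda a, b: a + b))
    print(wavefront([1, 1, 1], [1, 1, 1], lambda x, y, d: x + y))
```

The contrast between the two loops is the point of the sketch: All-Pairs exposes all of its cells as parallel work at once, whereas Wavefront exposes only one anti-diagonal at a time, so its available parallelism grows and then shrinks as the computation sweeps across the matrix.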
