Parallel algorithms

The subject of this chapter is the design and analysis of parallel algorithms. Most of today's algorithms are sequential, that is, they specify a sequence of steps in which each step consists of a single operation. These algorithms are well suited to today's computers, which basically perform operations in a sequential fashion. Although the speed at which sequential computers operate has been improving at an exponential rate for many years, the improvement is now coming at greater and greater cost. As a consequence, researchers have sought more cost-effective improvements by building "parallel" computers, that is, computers that perform multiple operations in a single step. In order to solve a problem efficiently on a parallel machine, it is usually necessary to design an algorithm that specifies multiple operations on each step, i.e., a parallel algorithm.

As an example, consider the problem of computing the sum of a sequence A of n numbers. The standard algorithm computes the sum by making a single pass through the sequence, keeping a running sum of the numbers seen so far. It is not difficult, however, to devise an algorithm for computing the sum that performs many operations in parallel. For example, suppose that, in parallel, each element of A with an even index is paired and summed with the next element of A, which has an odd index, i.e., A[0] is paired with A[1], A[2] with A[3], and so on. The result is a new sequence of $\lceil n/2 \rceil$ numbers that sum to the same value as the sum that we wish to compute. This pairing and summing step can be repeated until, after $\lceil \log_2 n \rceil$ steps, a sequence consisting of a single value is produced, and this value is equal to the final sum.

The parallelism in an algorithm can yield improved performance on many different kinds of computers. For example, on a parallel computer, the operations in a parallel algorithm can be performed simultaneously by different processors.
Furthermore, even on a single-processor computer the parallelism in an algorithm can be exploited by using multiple functional units, pipelined functional units, or pipelined memory systems. Thus, it is important to make a distinction between
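
The pairing-and-summing scheme described in the summation example can be sketched in a few lines of Python. This is a minimal illustration, not the chapter's own code: the function name `pairwise_sum` is hypothetical, the rounds are simulated sequentially, and the handling of an unpaired last element (it is carried over unchanged) and of an empty sequence (it sums to 0) are assumptions. Each round corresponds to one parallel step, since the pair sums within a round are independent of one another.

```python
def pairwise_sum(a):
    """Sum a sequence by repeated pairing; return (total, rounds).

    `rounds` counts the pairing steps, each of which a parallel
    machine could in principle perform as a single step.
    """
    seq = list(a)
    if not seq:
        return 0, 0  # assumption: an empty sequence sums to 0 in 0 rounds

    rounds = 0
    while len(seq) > 1:
        # Pair each even-indexed element with its odd-indexed successor.
        # The sums are independent, so they could run simultaneously;
        # here they are computed one after another.
        paired = [seq[i] + seq[i + 1] for i in range(0, len(seq) - 1, 2)]
        # Assumption: with an odd length, the last element has no
        # partner and is carried into the next round unchanged.
        if len(seq) % 2 == 1:
            paired.append(seq[-1])
        seq = paired  # the new sequence has ceil(n/2) elements
        rounds += 1
    return seq[0], rounds


total, rounds = pairwise_sum([3, 1, 4, 1, 5, 9, 2, 6])
print(total, rounds)  # 31 3
```

With eight inputs the sequence shrinks from 8 to 4 to 2 to 1 elements, so three rounds are needed, matching $\lceil \log_2 8 \rceil = 3$. The total number of additions is still n - 1, the same as in the single-pass algorithm; only their arrangement into independent rounds differs.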
