On Optimal Algorithms for List Ranking in the Parallel External Memory Model with Applications to Treewidth and other Elementary Graph Problems

The performance of many algorithms on large input instances depends substantially on the number of cache misses they trigger rather than on the number of operations they execute. The external memory model captures this behavior in a natural way: it models a computer as a fast cache of bounded size and a conceptually infinite (external) memory. In contrast to the classical RAM model, the complexity measure is the number of cache lines transferred between the cache and the memory; computations on elements in the cache are not counted. Recent trends in processor design and advances in big data computing require massively parallel algorithms. The parallel external memory (PEM) model extends the external memory model so that it also captures parallelism. It consists of multiple processors, each with a private cache, that share the (external) memory. This thesis considers three computational problems in the context of (parallel) external memory algorithms. For the fundamental problem of list ranking, an algorithm was previously known that has sorting complexity for many settings of the PEM model. In the first part of this thesis, this algorithm is complemented by matching lower bounds for most practical settings. Interestingly, a stronger lower bound is shown for parameter ranges that have not been considered before. By modeling how list ranking algorithms retrieve information on the structure of the list in the memory, we give a lower bound that is quadratic in sorting complexity for certain parameter settings. Notably, this result implies the first non-trivial lower bounds for list ranking in the bulk synchronous parallel and the MapReduce models. These lower bounds are complemented by a list ranking algorithm that, in contrast to previous algorithms, is analyzed for all parameter settings of the PEM model. In the second part, an efficient PEM algorithm is presented that computes a tree decomposition of bounded width for a graph.
The main challenge is to implement a load balancing strategy such that the running
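
For readers unfamiliar with the list ranking problem named in the abstract, the following is a minimal sketch of the classical pointer-jumping approach, simulated sequentially. It is not the algorithm of the thesis and it ignores cache-line transfers entirely; it only illustrates what "ranking" a linked list means. The function name `list_rank` and the input convention (a successor array in which the tail points to itself) are assumptions made for this illustration.

```python
def list_rank(succ):
    """Return rank[i] = number of links from node i to the tail of the list.

    Assumed input convention (hypothetical, for illustration only):
    succ[i] is the index of the successor of node i, and the tail
    node points to itself.
    """
    n = len(succ)
    nxt = list(succ)
    # Initially every node knows only the distance to its direct successor.
    rank = [0 if nxt[i] == i else 1 for i in range(n)]

    # Pointer jumping: in each synchronous round every node adds the
    # distance stored at its current target and then skips over it.
    # The distance covered by each pointer doubles per round.
    while any(nxt[nxt[i]] != nxt[i] for i in range(n)):
        # Build the new arrays from the old ones so that all nodes
        # update "simultaneously", as the processors of a PRAM would.
        new_rank = [rank[i] + rank[nxt[i]] for i in range(n)]
        new_nxt = [nxt[nxt[i]] for i in range(n)]
        rank, nxt = new_rank, new_nxt
    return rank


if __name__ == "__main__":
    # List order: 1 -> 0 -> 2 -> 3 (node 3 is the tail).
    print(list_rank([2, 0, 3, 3]))
```

Each round of this sketch reads `nxt[nxt[i]]`, an access whose memory location depends on the list structure. Such accesses generally cannot be grouped into cache lines in advance, which is the kind of difficulty the external memory cost measure is meant to expose.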
