Introduction to Algorithms, Second Edition

Abstract problems

To understand the class of polynomial-time solvable problems, we must first have a formal notion of what a "problem" is. We define an abstract problem Q to be a binary relation on a set I of problem instances and a set S of problem solutions. For example, an instance for SHORTEST-PATH is a triple consisting of a graph and two vertices. A solution is a sequence of vertices in the graph, with perhaps the empty sequence denoting that no path exists. The problem SHORTEST-PATH itself is the relation that associates each instance of a graph and two vertices with a shortest path in the graph that connects the two vertices. Since shortest paths are not necessarily unique, a given problem instance may have more than one solution.

This formulation of an abstract problem is more general than is required for our purposes. As we saw above, the theory of NP-completeness restricts attention to decision problems: those having a yes/no solution. In this case, we can view an abstract decision problem as a function that maps the instance set I to the solution set {0, 1}. For example, a decision problem related to SHORTEST-PATH is the problem PATH that we saw earlier. If i = ⟨G, u, v, k⟩ is an instance of the decision problem PATH, then PATH(i) = 1 (yes) if a shortest path from u to v has at most k edges, and PATH(i) = 0 (no) otherwise.

Many abstract problems are not decision problems, but rather optimization problems, in which some value must be minimized or maximized. As we saw above, however, it is usually a simple matter to recast an optimization problem as a decision problem that is no harder.

Encodings

If a computer program is to solve an abstract problem, problem instances must be represented in a way that the program understands. An encoding of a set S of abstract objects is a mapping e from S to the set of binary strings. For example, we are all familiar with encoding the natural numbers N = {0, 1, 2, 3, 4, ...} as the strings {0, 1, 10, 11, 100, ...}.
Using this encoding, e(17) = 10001. Anyone who has looked at computer representations of keyboard characters is familiar with either the ASCII or EBCDIC codes. In the ASCII code, the encoding of A is 1000001. Even a compound object can be encoded as a binary string by combining the representations of its constituent parts. Polygons, graphs, functions, ordered pairs, programs: all can be encoded as binary strings.

Thus, a computer algorithm that "solves" some abstract decision problem actually takes an encoding of a problem instance as input. We call a problem whose instance set is the set of binary strings a concrete problem. We say that an algorithm solves a concrete problem in time O(T(n)) if, when it is provided a problem instance i of length n = |i|, the algorithm can produce the solution in O(T(n)) time. A concrete problem is polynomial-time solvable, therefore, if there exists an algorithm to solve it in time O(n^k) for some constant k. We can now formally define the complexity class P as the set of concrete decision problems that are polynomial-time solvable.

We can use encodings to map abstract problems to concrete problems. Given an abstract decision problem Q mapping an instance set I to {0, 1}, an encoding e : I → {0, 1}* can be used to induce a related concrete decision problem, which we denote by e(Q). If the solution to an abstract-problem instance i ∈ I is Q(i) ∈ {0, 1}, then the solution to the concrete-problem instance e(i) ∈ {0, 1}* is also Q(i). As a technicality, there may be some binary strings that represent no meaningful abstract-problem instance. For convenience, we shall assume that any such string is mapped arbitrarily to 0. Thus, the concrete problem produces the same solutions as the abstract problem on binary-string instances that represent the encodings of abstract-problem instances.
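The encodings and the induced concrete problem e(Q) described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the toy decision problem "is the number even?" is an assumption chosen for brevity, not a problem discussed in the text.

```python
def encode_natural(n: int) -> str:
    """Binary encoding e of a natural number, e.g. 17 -> '10001'."""
    if n < 0:
        raise ValueError("natural numbers only")
    return bin(n)[2:]


def encode_ascii(ch: str) -> str:
    """7-bit ASCII encoding of one character, e.g. 'A' -> '1000001'."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not an ASCII character")
    return format(code, "07b")


def is_even(n: int) -> int:
    """Toy abstract decision problem Q (assumed here for illustration)."""
    return 1 if n % 2 == 0 else 0


def concrete_is_even(s: str) -> int:
    """Induced concrete problem e(Q), defined on every binary string.

    Strings that are not the encoding of any natural number (the empty
    string, or a numeral with a leading zero) are mapped to 0.
    """
    well_formed = s == "0" or (s.startswith("1") and all(c in "01" for c in s))
    if not well_formed:
        return 0
    return is_even(int(s, 2))
```

On well-formed inputs, concrete_is_even(encode_natural(n)) agrees with is_even(n), which is exactly the relationship between e(Q) and Q stated above.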
We would like to extend the definition of polynomial-time solvability from concrete problems to abstract problems by using encodings as the bridge, but we would like the definition to be independent of any particular encoding. That is, the efficiency of solving a problem should not depend on how the problem is encoded. Unfortunately, it depends quite heavily on the encoding. For example, suppose that an integer k is to be provided as the sole input to an algorithm, and suppose that the running time of the algorithm is Θ(k). If the integer k is provided in unary (a string of k 1's), then the running time of the algorithm is O(n) on length-n inputs, which is polynomial time. If we use the more natural binary representation of the integer k, however, then the input length is n = ⌊lg k⌋ + 1. In this case, the running time of the algorithm is Θ(k) = Θ(2^n), which is exponential in the size of the input. Thus, depending on the encoding, the algorithm runs in either polynomial or superpolynomial time.

The encoding of an abstract problem is therefore quite important to our understanding of polynomial time. We cannot really talk about solving an abstract problem without first specifying an encoding. Nevertheless, in practice, if we rule out "expensive" encodings such as unary ones, the actual encoding of a problem makes little difference to whether the problem can be solved in polynomial time. For example, representing integers in base 3 instead of binary has no effect on whether a problem is solvable in polynomial time, since an integer represented in base 3 can be converted to an integer represented in base 2 in polynomial time.

We say that a function f : {0, 1}* → {0, 1}* is polynomial-time computable if there exists a polynomial-time algorithm A that, given any input x ∈ {0, 1}*, produces as output f(x).
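The unary-versus-binary example and the base-3 conversion mentioned above can be made concrete with a small sketch. The helper names are hypothetical, and count_to is merely a stand-in for any algorithm whose running time is Θ(k). For simplicity the base-3 numeral is written with the digit characters 0, 1, and 2 rather than as a binary string.

```python
def unary(k: int) -> str:
    """Unary encoding: a string of k 1's, so the input length is n = k."""
    return "1" * k


def binary(k: int) -> str:
    """Binary encoding: the input length is n = floor(lg k) + 1 for k >= 1."""
    return bin(k)[2:]


def count_to(k: int) -> int:
    """Stand-in for a Theta(k) algorithm: performs exactly k loop steps."""
    steps = 0
    for _ in range(k):
        steps += 1
    return steps


def base3_to_binary(digits: str) -> str:
    """Convert a base-3 numeral to binary using Horner's rule.

    The loop runs once per input digit and works on integers whose bit
    length is linear in the input length, so the conversion takes time
    polynomial in the input length.
    """
    value = 0
    for d in digits:
        if d not in ("0", "1", "2"):
            raise ValueError("not a base-3 digit: " + d)
        value = 3 * value + int(d)
    return bin(value)[2:]
```

For k = 1000, the unary input has length 1000 and the binary input has length 10, yet count_to performs 1000 steps in both cases: linear in the first length, but at least 2^(n-1) = 512 steps in terms of the second.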
For some set I of problem instances, we say that two encodings e1 and e2 are polynomially related if there exist two polynomial-time computable functions f12 and f21 such that for any i ∈ I, we have f12(e1(i)) = e2(i) and f21(e2(i)) = e1(i). That is, the encoding e2(i) can be computed from the encoding e1(i) by a polynomial-time algorithm, and vice versa. If two encodings e1 and e2 of an abstract problem are polynomially related, whether the problem is polynomial-time solvable or not is independent of which encoding we use, as the following lemma shows.

Lemma 34.1 Let Q be an abstract decision problem on an instance set I, and let e1 and e2 be polynomially related encodings on I. Then, e1(Q) ∈ P if and only if e2(Q) ∈ P.

Proof We need only prove the forward direction, since the backward direction is symmetric. Suppose, therefore, that e1(Q) can be solved in time O(n^k) for some constant k. Further, suppose that for any problem instance i, the encoding e1(i) can be computed from the encoding e2(i) in time O(n^c) for some constant c, where n = |e2(i)|. To solve problem e2(Q), on input e2(i), we first compute e1(i) and then run the algorithm for e1(Q) on e1(i). How long does this take? The conversion of encodings takes time O(n^c), and therefore |e1(i)| = O(n^c), since the output of a serial computer cannot be longer than its running time. Solving the problem on e1(i) takes time O(|e1(i)|^k) = O(n^(ck)), which is polynomial since both c and k are constants.

Thus, whether an abstract problem has its instances encoded in binary or base 3 does not affect its "complexity," that is, whether it is polynomial-time solvable or not, but if instances are encoded in unary, its complexity may change. In order to be able to converse in an encoding-independent fashion, we shall generally assume that problem instances are encoded in any reasonable, concise fashion, unless we specifically say otherwise.
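The construction used in the proof of Lemma 34.1 (convert the input, then reuse the existing algorithm) can be sketched as follows. Everything here is an illustrative assumption: e1 is plain binary, e2 is a hypothetical two-bits-per-digit base-3 encoding, and Q is the toy problem "is i even?". The converter f21 assumes its input is a valid e2 encoding.

```python
def e1(i: int) -> str:
    """Encoding e1: the plain binary representation of a natural number."""
    return bin(i)[2:]


def e2(i: int) -> str:
    """Encoding e2: base-3 digits, each written as two bits (0->00, 1->01, 2->10)."""
    digits = []
    while True:
        i, r = divmod(i, 3)
        digits.append(format(r, "02b"))
        if i == 0:
            break
    return "".join(reversed(digits))


def f21(s: str) -> str:
    """Conversion from e2(i) to e1(i); assumes s is a valid e2 encoding.

    One Horner step per two-bit digit, so the running time is polynomial
    in len(s).
    """
    value = 0
    for pos in range(0, len(s), 2):
        value = 3 * value + int(s[pos:pos + 2], 2)
    return bin(value)[2:]


def solve_e1(s: str) -> int:
    """Assumed polynomial-time algorithm for e1(Q): 1 iff the number is even."""
    return 1 if s.endswith("0") else 0


def solve_e2(s: str) -> int:
    """Algorithm for e2(Q) built as in the proof: convert, then run solve_e1."""
    return solve_e1(f21(s))
```

For instance, e2(17) is 011010, f21 maps it back to 10001, and solve_e2 answers 0 because 17 is odd. The total cost is the conversion time plus the time of solve_e1 on an input whose length is bounded by the conversion time, matching the O(n^c) and O(n^(ck)) terms in the proof.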
To be precise, we shall assume that the encoding of an integer is polynomially related to its binary representation, and that the encoding of a finite set is polynomially related to its encoding as a list of its elements, enclosed in braces and separated by commas. (ASCII is one such encoding scheme.) With such a "standard" encoding in hand, we can derive reasonable encodings of other mathematical objects, such as tuples, graphs, and formulas. To denote the standard encoding of an object, we shall enclose the object in angle braces. Thus, ⟨G⟩ denotes the standard encoding of a graph G.

As long as we implicitly use an encoding that is polynomially related to this standard encoding, we can talk directly about abstract problems without reference to any particular encoding, knowing that the choice of encoding has no effect on whether the abstract problem is polynomial-time solvable. Henceforth, we shall generally assume that all problem instances are binary strings encoded using the standard encoding, unless we explicitly specify the contrary. We shall also typically neglect the distinction between abstract and concrete problems. The reader should watch out for problems that arise in practice, however, in which a standard encoding is not obvious and the encoding does make a difference.

A formal-language framework

One of the convenient aspects of focusing on decision problems is that they make it easy to use the machinery of formal-language theory. It is worthwhile at this point to review some definitions from that theory. An alphabet Σ is a finite set of symbols. A language L over Σ is any set of strings made up of symbols from Σ. For example, if Σ = {0, 1}, the set L = {10, 11, 101, 111, 1011, 1101, 10001, ...} is the language of binary representations of prime numbers. We denote the empty string by ε, and the empty language by ∅. The language of all strings over Σ is denoted Σ*.
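As a small illustration of a language viewed as a set of strings, the following sketch enumerates the language of binary representations of primes given above and tests membership in it. The function names are hypothetical, and trial division is used only because it is short; its running time is exponential in the length of the binary input, so it says nothing about whether this language can be decided in polynomial time.

```python
from itertools import count, islice


def is_prime(n: int) -> bool:
    """Primality by trial division (simple, but slow for large n)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def in_primes_language(s: str) -> bool:
    """Membership test: is s the binary representation of a prime?

    Strings that are empty, contain other symbols, or have a leading
    zero are not binary representations and are rejected.
    """
    if not s or s[0] != "1" or any(c not in "01" for c in s):
        return False
    return is_prime(int(s, 2))


def primes_language():
    """Yield the strings of the language in increasing numeric order."""
    for n in count(2):
        if is_prime(n):
            yield bin(n)[2:]


# The first seven strings: 10, 11, 101, 111, 1011, 1101, 10001
first_seven = list(islice(primes_language(), 7))
```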
For example, if Σ = {0, 1}, then Σ* = {ε, 0, 1, 00, 01, 10, 11, 000, ...} is the set of all binary strings. Every language L over Σ is a subset of Σ*.

There are a variety of operations on languages. Set-theoretic operations, such as union and intersection, follow directly from the set-theoretic definitions. We define the complement of L by L̄ = Σ* − L. The concatenation of two languages L1 and L2 is the language L = {x1x2 : x1 ∈ L1 and x2 ∈ L2}. The closure or Kleene star of a language L is the language L* = {ε} ∪ L ∪ L^2 ∪ L^3 ∪ ···, where L^k is the language obtained by
