Work efficient parallel solution of Toeplitz systems and polynomial GCD

A matrix A = [a_{i,j}] is Toeplitz if a_{i,j} = a_{i+k,j+k} for each k where the matrix elements are defined. We define an n x n matrix to have displacement rank (disp-rank) δ if it can be written as the sum of δ terms, where each term is either (i) the product of a lower triangular Toeplitz matrix and an upper triangular Toeplitz matrix or (ii) the product of an upper triangular Toeplitz matrix and a lower triangular Toeplitz matrix. There are known efficient sequential algorithms [BA80,BGY80] for inverse, determinant, linear system solution, factorization, and finding the rank for the case of Toeplitz matrices and matrices of bounded disp-rank, but there are no such results for efficient parallel algorithms. We assume the input matrices have entries that are either integers with a polynomial number of bits or rational numbers expressed as a ratio of integers with a polynomial number of bits. We do not make any other assumption about the input. We assume the arithmetic PRAM model of parallel computation. In this paper, we show that certain structured linear systems can be solved exactly and efficiently in parallel, dropping the best previous processor bounds to nearly linear, without significant slowdown. We give much improved parallel algorithms for exact solution, factorization, determinant, inverse, and rank computation for various structured matrices, in particular Toeplitz matrices and matrices of bounded disp-rank. We apply this result to obtain efficient randomized parallel algorithms, with the same parallel time and processor bounds, for the following problems: (1) polynomial greatest common divisors (GCD) and extended GCD, (2) polynomial resultant, (3) Padé approximants of rational functions, (4) shift register synthesis and BCH decoding problems, and (5) Sturm sequences and real root isolation. We are the first to give parallel algorithms for these problems that run in polylog time with a nearly linear number of processors.
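The two definitions above can be checked on small examples. The following is a minimal sketch, not the paper's algorithm: it tests the Toeplitz property directly and, for terms of type (i) only, uses the standard fact that a product L·U of a lower and an upper triangular Toeplitz matrix satisfies LU - Z(LU)Z^T = x y^T, where Z is the lower shift matrix, x is the first column of L, and y^T is the first row of U. A sum of δ such products therefore has a shift displacement of rank at most δ. All function names here are illustrative assumptions, and exact rational arithmetic is used to mirror the exact-computation setting.

```python
from fractions import Fraction


def toeplitz(first_col, first_row):
    """Square Toeplitz matrix from its first column and first row."""
    assert first_col[0] == first_row[0] and len(first_col) == len(first_row)
    n = len(first_col)
    return [[first_col[i - j] if i >= j else first_row[j - i]
             for j in range(n)] for i in range(n)]


def lower(col):
    """Lower triangular Toeplitz matrix with the given first column."""
    return toeplitz(col, [col[0]] + [0] * (len(col) - 1))


def upper(row):
    """Upper triangular Toeplitz matrix with the given first row."""
    return toeplitz([row[0]] + [0] * (len(row) - 1), row)


def is_toeplitz(a):
    """Check a[i][j] == a[i+1][j+1] wherever both entries exist."""
    n = len(a)
    return all(a[i][j] == a[i + 1][j + 1]
               for i in range(n - 1) for j in range(n - 1))


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def matadd(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def shift_displacement(a):
    """A - Z A Z^T, where Z has ones on the subdiagonal."""
    n = len(a)
    return [[a[i][j] - (a[i - 1][j - 1] if i > 0 and j > 0 else 0)
             for j in range(n)] for i in range(n)]


def exact_rank(a):
    """Rank by Gaussian elimination over the rationals (sequential)."""
    m = [[Fraction(x) for x in row] for row in a]
    rank = 0
    for c in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][c] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][c] / m[rank][c]
            m[r] = [x - factor * y for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank


# A Toeplitz matrix: its displacement is supported on the first row and column.
T = toeplitz([4, 1, 2, 3], [4, 5, 6, 7])
# A sum of two type (i) terms: not Toeplitz, but still of low displacement rank.
A = matadd(matmul(lower([1, 2, 3]), upper([1, 0, 2])),
           matmul(lower([0, 1, 1]), upper([2, 1, 0])))

print(is_toeplitz(T), exact_rank(shift_displacement(T)))  # True 2
print(is_toeplitz(A), exact_rank(shift_displacement(A)))  # False 2
```

Terms of type (ii), upper times lower, are compressed by the transposed operator A - Z^T A Z instead, which this sketch does not cover.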
Previously, the best parallel algorithms [PR87,P88,Pa90] for these problems required Ω(log n) time with n^2 / log n processors (or O(log^{O(1)} n) time using at least Ω(n^2 / log^{O(1)} n) processors), whereas the best sequential time was O(n log^2 n). In this paper, we describe our parallel algorithm for structured linear systems of bounded displacement rank, which costs time O(log^2 n) using n (log n)^ω processors, where ω = 2.376. Our results drop by a nearly linear factor the best previous processor bounds for polylog time parallel algorithms for all these problems, and our results are within a polylog factor of work compared to the best sequential work bounds of O(n log^2 n). All our computations require bit precision O(n(β + log n)), which is the asymptotically optimal bit precision for β ≥ log n, since the determinant, exact LU factorization, and matrix inverse require bit precision at least Ω(nβ).

Address: Department of Computer Science, Duke University, Durham, NC 27708-0129; E-mail: reif@cs.duke.edu. Work also done in part during sabbatical at School of CS, CMU. Supported by DARPA/ISTO Grant N00014-91-J-1985, Subcontract KI-92-01-0182 of DARPA/ISTO prime Contract N00014-92-C-0182, and NSF Grant NSF-IRI-91-00681.

STOC '95, Las Vegas, Nevada, USA. © 1995 ACM 0-89791-718-9/95/0005.
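The link between structured linear algebra and applications (1) and (2) can be illustrated with the classical Sylvester matrix. For polynomials f and g of degrees m and n with nonzero leading coefficients, the Sylvester matrix consists of two blocks of shifted coefficient rows, each of which is Toeplitz. Its determinant is the resultant, and deg gcd(f, g) = m + n - rank. The sketch below is only an assumption-laden illustration of that reduction: it uses plain sequential Gaussian elimination over the rationals rather than the parallel algorithm described in this paper, and the function names are hypothetical.

```python
from fractions import Fraction


def sylvester(f, g):
    """Sylvester matrix of f and g (coefficients from highest degree down)."""
    m, n = len(f) - 1, len(g) - 1
    size = m + n
    rows = []
    for i in range(n):  # n shifted copies of f form a Toeplitz block
        rows.append([0] * i + list(f) + [0] * (size - m - 1 - i))
    for i in range(m):  # m shifted copies of g form a second Toeplitz block
        rows.append([0] * i + list(g) + [0] * (size - n - 1 - i))
    return rows


def rank_and_det(a):
    """Exact rank and determinant of a square matrix by elimination."""
    m = [[Fraction(x) for x in row] for row in a]
    n = len(m)
    rank, det = 0, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(rank, n) if m[r][c] != 0), None)
        if pivot is None:
            det = Fraction(0)  # a column without a pivot means singular
            continue
        if pivot != rank:
            m[rank], m[pivot] = m[pivot], m[rank]
            det = -det  # a row swap flips the sign of the determinant
        det *= m[rank][c]
        for r in range(rank + 1, n):
            factor = m[r][c] / m[rank][c]
            m[r] = [x - factor * y for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank, det


def resultant(f, g):
    return rank_and_det(sylvester(f, g))[1]


def gcd_degree(f, g):
    """Degree of gcd(f, g), assuming nonzero leading coefficients."""
    rank, _ = rank_and_det(sylvester(f, g))
    return (len(f) - 1) + (len(g) - 1) - rank


f = [1, -3, 2]   # x^2 - 3x + 2 = (x - 1)(x - 2)
g = [1, 2, -3]   # x^2 + 2x - 3 = (x - 1)(x + 3)
h = [1, -3]      # x - 3, coprime to f

print(gcd_degree(f, g), resultant(f, g))  # 1 0
print(gcd_degree(f, h), resultant(f, h))  # 0 2
```

A vanishing resultant signals a common factor, and the rank deficiency of the Sylvester matrix gives the degree of that factor.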

[1]  L. Ljung,et al.  Extended Levinson and Chandrasekhar equations for general discrete-time linear estimation problems , 1978 .

[2]  Tricia Walker,et al.  Computer science , 1996, English for academic purposes series.

[3]  Victor Y. Pan,et al.  The Parallel Computation of Minimum Cost Paths in Graphs by Stream Contraction , 1991, Inf. Process. Lett..

[4]  W. Gragg,et al.  The Padé Table and Its Relation to Certain Algorithms of Numerical Analysis , 1972 .

[5]  Bruce Ronald. Musicus,et al.  Levinson and fast Choleski algorithms for Toeplitz and almost Toeplitz matrices , 1988 .

[6]  Victor Y. Pan,et al.  Processor efficient parallel solution of linear systems over an abstract field , 1991, SPAA '91.

[7]  H. Hotelling Some New Methods in Matrix Calculation , 1943 .

[8]  Victor Y. Pan,et al.  Fast and Efficient Parallel Solution of Sparse Linear Systems , 1993, SIAM J. Comput..

[9]  Erich Kaltofen,et al.  On Wiedemann's Method of Solving Sparse Linear Systems , 1991, AAECC.

[10]  V. Pan,et al.  Polynomial and Matrix Computations , 1994, Progress in Theoretical Computer Science.

[11]  G. G. Stokes "J." , 1890, The New Yale Book of Quotations.

[12]  Victor Y. Pan,et al.  Fast and efficient parallel solution of dense linear systems , 1989 .

[13]  Michael Ben-Or,et al.  Simple algorithms for approximating all roots of a polynomial with real roots , 1990, J. Complex..

[14]  Donald E. Knuth,et al.  The art of computer programming. Vol.2: Seminumerical algorithms , 1981 .

[15]  Donald Ervin Knuth,et al.  The Art of Computer Programming , 1968 .

[16]  Alfred V. Aho,et al.  The Design and Analysis of Computer Algorithms , 1974 .

[17]  Georg Heinig,et al.  Algebraic Methods for Toeplitz-like Matrices and Operators , 1984 .

[18]  Joseph F. Traub,et al.  On Euclid's Algorithm and the Theory of Subresultants , 1971, JACM.

[19]  Fred G. Gustavson,et al.  Analysis of the Berlekamp-Massey Linear Feedback Shift-Register Synthesis Algorithm , 1976, IBM J. Res. Dev..

[20]  Thomas Kailath,et al.  Generalized Gohberg-Semencul Formulas for Matrix Inversion , 1989 .

[21]  W. Gragg,et al.  Superfast solution of real positive definite Toeplitz systems , 1988 .

[22]  Thomas Kailath,et al.  Linear complexity parallel algorithms for linear systems of equations with recursive structure , 1987 .

[23]  Adi Ben-Israel,et al.  A note on an iterative method for generalized inversion of matrices , 1966 .

[24]  H. Hotelling Further Points on Matrix Calculation and Simultaneous Equations , 1943 .

[25]  W. F. Trench An Algorithm for the Inversion of Finite Toeplitz Matrices , 1964 .

[26]  Victor Y. Pan,et al.  Efficient parallel solution of linear systems , 1985, STOC '85.

[27]  J. Makhoul,et al.  Linear prediction: A tutorial review , 1975, Proceedings of the IEEE.

[28]  Ephraim Feig,et al.  A Fast Parallel Algorithm for Determining all Roots of a Polynomial with Real Roots , 1988, SIAM J. Comput..

[29]  V. Pan On computations with dense structured matrices , 1990 .

[30]  Allan Borodin,et al.  The computational complexity of algebraic and numeric problems , 1975, Elsevier computer science library.

[31]  George E. Collins Polynomial Remainder Sequences and Determinants , 1966 .

[32]  M. Morf,et al.  Displacement ranks of matrices and linear equations , 1979 .

[33]  Thomas Kailath,et al.  Divide-and-conquer solutions of least-squares problems for matrices with displacement structure , 1991 .

[34]  Joseph JáJá,et al.  An Introduction to Parallel Algorithms , 1992 .

[35]  B. Anderson,et al.  Greatest common divisor via generalized Sylvester and Bezout matrices , 1978 .

[36]  Adi Ben-Israel,et al.  On Iterative Computation of Generalized Inverses and Associated Projections , 1966 .

[37]  B. Anderson,et al.  Asymptotically fast solution of Toeplitz and related systems of linear equations , 1980 .

[38]  Elwyn R. Berlekamp,et al.  Algebraic coding theory , 1984, McGraw-Hill series in systems science.

[39]  H. Padé Sur la représentation approchée d'une fonction par des fractions rationnelles , 1892 .

[40]  John H. Reif,et al.  Parallel Output-Sensitive Algorithms for Combinatorial and Linear Algebra Problems , 2001, J. Comput. Syst. Sci..

[41]  Donald E. Knuth The Art of Computer Programming 2 / Seminumerical Algorithms , 1971 .

[42]  Robert T. Moenck,et al.  Approximate algorithms to derive exact solutions to systems of linear equations , 1979, EUROSAM.

[43]  David Y. Y. Yun,et al.  Fast Solution of Toeplitz Systems of Equations and Computation of Padé Approximants , 1980, J. Algorithms.

[44]  V. Pan PARAMETRIZATION OF NEWTON'S ITERATION FOR COMPUTATIONS WITH STRUCTURED MATRICES AND APPLICATIONS , 1992 .

[45]  Jacob T. Schwartz,et al.  Fast Probabilistic Algorithms for Verification of Polynomial Identities , 1980, J. ACM.

[46]  Victor Y. Pan,et al.  Complexity of Parallel Matrix Computations , 1987, Theor. Comput. Sci..

[47]  John H. Reif,et al.  O(log^2 n) time efficient parallel factorization of dense, sparse separable, and banded matrices , 1994, SPAA '94.

[48]  John H. Reif,et al.  Synthesis of Parallel Algorithms , 1993 .