Paper information - Work efficient parallel solution of Toeplitz systems and polynomial GCD

Work efficient parallel solution of Toeplitz systems and polynomial GCD

A matrix $A = [a_{i,j}]$ is Toeplitz if $a_{i,j} = a_{i+k,j+k}$ for each $k$ where the matrix elements are defined. We define an $n \times n$ matrix to have displacement rank (disp-rank) $\delta$ if it can be written as the sum of $\delta$ terms, where each term is either (i) the product of a lower triangular Toeplitz matrix and an upper triangular Toeplitz matrix or (ii) the product of an upper triangular Toeplitz matrix and a lower triangular Toeplitz matrix. There are known efficient sequential algorithms [BA80, BGY80] for inverse, determinant, linear system solution, factorization, and rank computation for Toeplitz matrices and matrices of bounded disp-rank, but there are no such results for efficient parallel algorithms. We assume the input matrices have entries that are either integers with a polynomial number of bits or rational numbers expressed as a ratio of integers with a polynomial number of bits. We make no other assumption about the input. We assume the arithmetic PRAM model of parallel computation. In this paper, we show that certain structured linear systems can be solved exactly and efficiently in parallel, dropping the processor bounds to nearly linear without significant slowdown. We give much improved parallel algorithms for exact solution, factorization, determinant, inverse, and rank computation of various structured matrices, in particular Toeplitz matrices and matrices of bounded disp-rank. We apply this result to obtain efficient randomized parallel algorithms, within the same parallel time and processor bounds, for the following problems: (1) polynomial greatest common divisors (GCD) and extended GCD, (2) polynomial resultant, (3) Padé approximants of rational functions, (4) shift register synthesis and BCH decoding problems, and (5) Sturm sequences and real root isolation. We are the first to give parallel algorithms for these problems that run in polylog time with a linear number of processors.
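The two definitions above can be checked on small matrices. The following is a minimal sketch, not the paper's algorithm: it assumes the standard displacement operator $A - ZAZ^T$ (with $Z$ the down-shift matrix), under which a product $L(g)\,U(h)$ of a lower and an upper triangular Toeplitz matrix has the rank-one displacement $g h^T$. All helper names (`toeplitz`, `displacement`, `rank`, and so on) are hypothetical, and exact rational arithmetic is used only to keep the example free of rounding.

```python
from fractions import Fraction

def toeplitz(first_col, first_row):
    """Toeplitz matrix with the given first column and first row."""
    assert first_col[0] == first_row[0]
    return [[Fraction(first_col[i - j] if i >= j else first_row[j - i])
             for j in range(len(first_row))] for i in range(len(first_col))]

def lower_toeplitz(col):
    """Lower triangular Toeplitz matrix L(col) with first column col."""
    return toeplitz(col, [col[0]] + [0] * (len(col) - 1))

def upper_toeplitz(row):
    """Upper triangular Toeplitz matrix U(row) with first row row."""
    return toeplitz([row[0]] + [0] * (len(row) - 1), row)

def is_toeplitz(A):
    """True if a[i][j] == a[i+1][j+1] wherever both entries exist."""
    return all(A[i][j] == A[i + 1][j + 1]
               for i in range(len(A) - 1) for j in range(len(A[0]) - 1))

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matadd(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def displacement(A):
    """A - Z A Z^T; (Z A Z^T)[i][j] is A[i-1][j-1], or 0 on the border."""
    return [[A[i][j] - (A[i - 1][j - 1] if i > 0 and j > 0 else 0)
             for j in range(len(A[0]))] for i in range(len(A))]

def rank(M):
    """Exact rank by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

# A Toeplitz matrix: its displacement is nonzero only in row 0 and column 0.
T = toeplitz([4, 1, 2, 3], [4, 5, 6, 7])

# A sum of two (lower Toeplitz) x (upper Toeplitz) products: not Toeplitz
# itself, but its displacement is g1 h1^T + g2 h2^T.
g1, h1 = [1, 2, 0, 1, 3], [2, 1, 1, 0, 1]
g2, h2 = [0, 1, 1, 2, 1], [1, 0, 3, 1, 2]
A = matadd(matmul(lower_toeplitz(g1), upper_toeplitz(h1)),
           matmul(lower_toeplitz(g2), upper_toeplitz(h2)))

print(is_toeplitz(T), rank(displacement(T)))
print(is_toeplitz(A), rank(displacement(A)))
```

The point of the sketch is that the pairs $(g_k, h_k)$ describe an $n \times n$ matrix with $O(\delta n)$ numbers instead of $n^2$, which is the kind of compact representation that algorithms for bounded disp-rank matrices exploit.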
Previously, the best parallel algorithms [PR87, P88, Pa90] for these problems required $\Omega(\log n)$ time with $n^2/\log n$ processors (or $O(\log^{O(1)} n)$ time using at least $\Omega(n^2/\log^{O(1)} n)$ processors), whereas the best sequential time was $O(n \log^2 n)$. In this paper, we describe our parallel algorithm for structured linear systems of bounded displacement rank, which costs time $O(\log^2 n)$ using $n(\log n)^{\omega}$ processors, where $\omega = 2.376$. Our results reduce by a nearly linear factor the best previous processor bounds for polylog time parallel algorithms for all these problems, and our work is within a polylog factor of the best sequential work bound of $O(n \log^2 n)$. All our computations require bit precision $O(n(\beta + \log n))$, which is the asymptotically optimal bit precision for $\beta \ge \log n$, since the determinant, exact LU factorization, and matrix inverse require bit precision at least $\Omega(n\beta)$.

Author address: Department of Computer Science, Duke University, Durham, NC 27708-0129; e-mail: reif@cs.duke.edu. Work also done in part during sabbatical at School of CS, CMU. Supported by DARPA/ISTO Grant N00014-91-J-1985, Subcontract KI-92-01-0182 of DARPA/ISTO prime Contract N00014-92-C-0182, and NSF Grant NSF-IRI-91-00681. STOC '95, Las Vegas, Nevada, USA. © 1995 ACM 0-89791-718-9/95/0005.
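To illustrate why polynomial GCD and resultant appear among the applications of structured linear algebra, here is a minimal sketch of the classical Sylvester-matrix reduction. It is a textbook connection rather than the paper's method: the Sylvester matrix of $f$ and $g$ consists of two stacked Toeplitz blocks, its determinant is the resultant, and $\deg \gcd(f, g) = \deg f + \deg g - \operatorname{rank}$ of that matrix. The function names are hypothetical, and the dense sequential elimination used here costs $O(n^3)$ operations, far from the bounds discussed above.

```python
from fractions import Fraction

def sylvester(f, g):
    """Sylvester matrix of f and g (coefficients, highest degree first)."""
    m, n = len(f) - 1, len(g) - 1
    size = m + n
    rows = []
    for i in range(n):  # n shifted copies of f form one Toeplitz block
        rows.append([0] * i + list(f) + [0] * (size - m - 1 - i))
    for i in range(m):  # m shifted copies of g form the other block
        rows.append([0] * i + list(g) + [0] * (size - n - 1 - i))
    return rows

def rank(M):
    """Exact rank by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def det(M):
    """Exact determinant by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    n, d = len(M), Fraction(1)
    for c in range(n):
        pivot = next((i for i in range(c, n) if M[i][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            d = -d  # a row swap flips the sign of the determinant
        d *= M[c][c]
        for i in range(c + 1, n):
            factor = M[i][c] / M[c][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[c])]
    return d

def gcd_degree(f, g):
    """deg gcd(f, g) = deg f + deg g - rank(Sylvester(f, g))."""
    return (len(f) - 1) + (len(g) - 1) - rank(sylvester(f, g))

def resultant(f, g):
    """Resultant of f and g as the determinant of their Sylvester matrix."""
    return det(sylvester(f, g))

f = [1, -3, 2]   # x^2 - 3x + 2 = (x - 1)(x - 2)
g = [1, 2, -3]   # x^2 + 2x - 3 = (x - 1)(x + 3)
h = [1, 3]       # x + 3, which shares no root with f

print(gcd_degree(f, g), resultant(f, g))  # common factor x - 1
print(gcd_degree(f, h), resultant(f, h))  # coprime pair
```

Because both blocks of the Sylvester matrix are Toeplitz, it is a structured matrix of the kind the abstract describes, which is what lets a fast structured solver replace the generic elimination shown here.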

John H. Reif | J. Reif
