Multithreaded parallel implementation of arithmetic operations modulo a triangular set

We discuss the parallelization of arithmetic operations on polynomials modulo a triangular set. We focus on parallel normal form computations, since these are a core subroutine in many high-level algorithms, such as triangular decompositions of polynomial systems. When computing modulo a triangular set, multivariate polynomials are regarded recursively as univariate ones, which leads to several implementation challenges when one targets highly efficient code. We rely on an algorithm proposed in [17] which addresses some of these issues. We propose a two-level parallel scheme. First, we use a parallel multidimensional Fast Fourier Transform to perform multivariate polynomial multiplication. Second, we extract parallelism from the structure of the sequential normal form algorithm of [17]. We have realized a multithreaded implementation and report on different strategies for the management of tasks and threads.
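To make the notion of a normal form modulo a triangular set concrete, the following is a minimal sequential sketch in Python. It is not the algorithm of [17], and it uses neither FFT-based multiplication nor threads: it performs plain term-by-term division, one variable at a time, from the highest variable down. The prime `P`, the dictionary representation of polynomials, and the names `add_term`, `reduce_by`, and `normal_form` are assumptions introduced only for this illustration. Each element of the triangular set is assumed to be monic in its main variable.

```python
# Illustrative sketch only: naive normal form modulo a triangular set over Z/P.
# A polynomial is a dict {exponent tuple: coefficient mod P}.
# P and all function names below are hypothetical choices for this example.

P = 97  # assumed small prime modulus


def add_term(poly, exp, c):
    """Add c * x^exp to poly in place, dropping terms that become zero."""
    c = (poly.get(exp, 0) + c) % P
    if c:
        poly[exp] = c
    else:
        poly.pop(exp, None)


def reduce_by(f, t, i):
    """Remainder of f divided by t, both viewed as univariate in variable i.

    Assumes t is monic in variable i, i.e. its only term of top degree d
    in that variable is x_i^d with coefficient 1.
    """
    d = max(e[i] for e in t)
    f = dict(f)
    while True:
        # Pick any term whose degree in variable i is still too high.
        e = next((e for e in f if e[i] >= d), None)
        if e is None:
            return f
        c = f[e]
        shift = e[:i] + (e[i] - d,) + e[i + 1:]
        # Subtract c * x^shift * t; the leading term cancels the term at e.
        for te, tc in t.items():
            add_term(f, tuple(a + b for a, b in zip(shift, te)), -c * tc)


def normal_form(f, tset):
    """Reduce f by tset[n-1], ..., tset[0], where tset[i] has main variable i."""
    for i in reversed(range(len(tset))):
        f = reduce_by(f, tset[i], i)
    return f


# Triangular set in (x, y): T1 = x^2 + 1, T2 = y^2 - x (coefficients mod 97).
T = [{(2, 0): 1, (0, 0): 1}, {(0, 2): 1, (1, 0): P - 1}]

# x * y^3 = x * y * y^2 reduces to x^2 * y, and then to -y modulo T1.
print(normal_form({(1, 3): 1}, T))  # {(0, 1): 96}
```

Reducing from the highest variable down is enough here because the element with main variable i involves no variable of higher index, so later reductions cannot raise degrees that earlier ones already lowered.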

[1]  Akimasa Morihata,et al.  Automatic inversion generates divide-and-conquer parallel programs , 2007, PLDI '07.

[2]  Laurent Imbert,et al.  Parallel Montgomery multiplication in GF(2^k) using trinomial residue arithmetic , 2005, 17th IEEE Symposium on Computer Arithmetic (ARITH'05).

[3]  Hans-Wolfgang Loidl,et al.  Parallel Computation of Modular Multivariate Polynomial Resultants on a Shared Memory Machine , 1994, CONPAR.

[4]  Sung-Ho Hwang,et al.  Parallel Modular Multiplication Algorithm in Residue Number System , 2003, PPAM.

[5]  R. Al Na'mneh,et al.  Communication efficient adaptive matrix transpose algorithm for FFT on symmetric multiprocessors , 2005, Proceedings of the Thirty-Seventh Southeastern Symposium on System Theory, 2005. SSST '05.

[6]  H. T. Kung.  On computing reciprocals of power series , 1974 .

[7]  Ç. Koç,et al.  Parallel Multiplication in GF(2^k) using Polynomial Residue Arithmetic , 2000 .

[8]  S. Cook,et al.  ON THE MINIMUM COMPUTATION TIME OF FUNCTIONS , 1969 .

[9]  R. Gregory Taylor,et al.  Modern computer algebra , 2002, SIGA.

[10]  Marc Moreno Maza,et al.  Fast arithmetic for triangular sets: from theory to practice , 2007, ISSAC '07.

[11]  Joris van der Hoeven.  The truncated Fourier transform and applications , 2004, ISSAC '04.

[12]  Marc Moreno Maza,et al.  Efficient Implementation of Polynomial Arithmetic in a Multiple-Level Programming Environment , 2006, ICMS.

[13]  Marc Moreno Maza,et al.  On the Virtues of Generic Programming for Symbolic Computation , 2007, International Conference on Computational Science.

[14]  Jeremy R. Johnson,et al.  Architecture-aware classical Taylor shift by 1 , 2005, ISSAC.

[15]  Marc Moreno Maza,et al.  Component-level parallelization of triangular decompositions , 2007, PASCO '07.

[16]  Laurent Imbert,et al.  Parallel Montgomery Multiplication in GF(2^k) using Trinomial Residue Arithmetic , 2004, IACR Cryptology ePrint Archive.

[17]  Marc Moreno Maza,et al.  Lifting techniques for triangular decompositions , 2005, ISSAC.

[18]  Stephen M. Watt,et al.  Multiprocessed parallelism support in ALDOR on SMPs and multicores , 2007, PASCO '07.

[19]  Sébastien Varrette,et al.  Probabilistic certification of divide & conquer algorithms on global computing platforms: application to fault-tolerant exact matrix-vector product , 2007, PASCO '07.

[20]  Matteo Frigo,et al.  An analysis of dag-consistent distributed shared-memory algorithms , 1996, SPAA '96.

[21]  Malte Sieveking.  An algorithm for division of powerseries , 2005, Computing.

[22]  Marc Moreno Maza,et al.  Implementation techniques for fast polynomial arithmetic in a high-level programming environment , 2006, ISSAC '06.

[23]  Wolfgang Küchlin,et al.  On the multi-threaded computation of integral polynomial greatest common divisors , 1991, ISSAC '91.

[24]  Marc Moreno Maza,et al.  On computer-assisted classification of coupled integrable equations , 2001, ISSAC '01.