A task-parallel algorithm for distinct degree factorization (DDF) is considered. Recent DDF algorithms consist of coarse and fine DDF steps, and rely on asymptotically fast algorithms for various polynomial manipulations, including binary-tree multiplication, Chinese remaindering, multipoint evaluation and modular composition. Some basic parallelization techniques are summarized and applied to these component algorithms, without parallelizing the arithmetic operations on univariate polynomials themselves. Greater emphasis is placed on the scheduling of the computation steps, and one of the major findings is that, from the viewpoint of time complexity, the fine DDF steps can be concealed by the coarse ones. Finally, we present a complete description of our new parallel algorithm, which is intended for practical use, and show that it can perform DDF of a univariate polynomial of degree $n$ over a finite field of $q$ elements in time $O(M(n)\log q + (M(n^{3/2}) + n^{1/2}\,MM(n^{1/2}))\log n)$ on $n^{1/2}$ processors using $O(n^{3/2})$ space, where $M(n)$ and $MM(k)$ denote the costs of multiplying polynomials of degree $n$ and $k \times k$ matrices, respectively.
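For readers unfamiliar with DDF, the following is a minimal sequential sketch of the textbook method over a prime field GF(p). It is not the coarse/fine scheme or the parallel schedule described above, and it uses schoolbook polynomial arithmetic rather than the asymptotically fast routines. All function names are hypothetical, polynomials are coefficient lists with the lowest degree first, and the input is assumed to be monic and squarefree.

```python
def trim(a):
    """Return a copy of a with trailing zero coefficients removed."""
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a

def deg(a):
    return len(a) - 1  # the zero polynomial [] has degree -1

def poly_sub(a, b, p):
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return trim([(x - y) % p for x, y in zip(a, b)])

def poly_mul(a, b, p):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return trim(out)

def poly_divmod(a, b, p):
    """Quotient and remainder of a by a nonzero trimmed b over GF(p)."""
    a = trim(a)
    q = [0] * max(len(a) - len(b) + 1, 0)
    inv = pow(b[-1], -1, p)  # modular inverse, Python 3.8+
    while len(a) >= len(b):
        c = a[-1] * inv % p
        s = len(a) - len(b)
        q[s] = c
        for i, y in enumerate(b):
            a[s + i] = (a[s + i] - c * y) % p
        a = trim(a)
    return trim(q), a

def poly_gcd(a, b, p):
    """Monic gcd via the Euclidean algorithm."""
    a, b = trim(a), trim(b)
    while b:
        a, b = b, poly_divmod(a, b, p)[1]
    if not a:
        return []
    inv = pow(a[-1], -1, p)
    return [c * inv % p for c in a]

def poly_powmod(base, e, f, p):
    """base**e mod f by repeated squaring."""
    result = [1]
    base = poly_divmod(base, f, p)[1]
    while e:
        if e & 1:
            result = poly_divmod(poly_mul(result, base, p), f, p)[1]
        base = poly_divmod(poly_mul(base, base, p), f, p)[1]
        e >>= 1
    return result

def ddf(f, p):
    """Return [(d, g_d)], g_d being the product of the degree-d irreducible factors of f."""
    f = trim(f)
    x = [0, 1]
    h = x  # invariant: h = x^(p^d) mod f
    d = 0
    result = []
    while deg(f) >= 2 * (d + 1):
        d += 1
        h = poly_powmod(h, p, f, p)
        g = poly_gcd(poly_sub(h, x, p), f, p)
        if deg(g) > 0:
            result.append((d, g))
            f = poly_divmod(f, g, p)[0]
            h = poly_divmod(h, f, p)[1]
    if deg(f) > 0:
        result.append((deg(f), f))  # what remains is irreducible
    return result
```

For example, `ddf([0, 1, 0, 0, 1], 2)` splits x^4 + x over GF(2) into the product of its linear factors, x^2 + x, and its single quadratic factor, x^2 + x + 1. The dominant cost is the repeated computation of x^(p^d) mod f, which is the part that fast modular composition and the scheduling discussed above are meant to speed up.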