Parallel distinct degree factorization algorithm

A task-parallel algorithm for distinct degree factorization (DDF) is considered. Recent DDF algorithms consist of coarse and fine DDF steps, and utilize asymptotically fast algorithms for various polynomial manipulations, including binary-tree multiplication, Chinese remaindering, multipoint evaluation and modular composition. Some basic techniques for parallelization are summarized and applied to these component algorithms, without parallelizing any arithmetic operation on univariate polynomials. Greater attention is given to the scheduling of the computation steps, and one of the major findings is that, from the viewpoint of time complexity, the fine DDF steps can be concealed by the coarse ones. Finally, we present a complete description of our new parallel algorithm of practical use, and show that it can perform DDF of a univariate polynomial of degree $n$ over a finite field of $q$ elements in time $O(M(n)\log q + (M(n^{3/2}) + n^{1/2}\,MM(n^{1/2}))\log n)$ on $n^{1/2}$ processors using $O(n^{3/2})$ space, where $M(n)$ and $MM(k)$ denote the costs of multiplying polynomials of degree $n$ and $k \times k$ matrices, respectively.
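For orientation, the task that DDF performs can be illustrated by a minimal sequential sketch. This is the classical textbook procedure (repeated Frobenius powering followed by a gcd), not the parallel coarse/fine algorithm described above, and it uses schoolbook arithmetic rather than fast multiplication. It assumes a prime field (q = p), a monic squarefree input, and polynomials stored as coefficient lists with the lowest degree first; all function names are hypothetical.

```python
# Classical sequential DDF over a prime field F_p (illustrative sketch only).
# Polynomials are lists of coefficients in [0, p), lowest degree first.
# Requires Python 3.8+ for pow(x, -1, p).

def trim(a):
    # Drop zero coefficients at the high-degree end (in place).
    while a and a[-1] == 0:
        a.pop()
    return a

def poly_sub(a, b, p):
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return trim([(x - y) % p for x, y in zip(a, b)])

def poly_mul(a, b, p):
    if not a or not b:
        return []
    r = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            r[i + j] = (r[i + j] + ai * bj) % p
    return trim(r)

def poly_divmod(a, b, p):
    # Schoolbook division of a by nonzero b; returns (quotient, remainder).
    a = a[:]
    q = [0] * max(len(a) - len(b) + 1, 0)
    inv = pow(b[-1], -1, p)  # inverse of the leading coefficient of b
    for i in range(len(a) - len(b), -1, -1):
        c = a[i + len(b) - 1] * inv % p
        q[i] = c
        for j, bj in enumerate(b):
            a[i + j] = (a[i + j] - c * bj) % p
    return trim(q), trim(a)

def poly_gcd(a, b, p):
    # Monic gcd by the Euclidean algorithm; a must be nonzero.
    while b:
        a, b = b, poly_divmod(a, b, p)[1]
    inv = pow(a[-1], -1, p)
    return [c * inv % p for c in a]

def poly_powmod(a, e, f, p):
    # a^e mod f by binary exponentiation.
    result = [1]
    a = poly_divmod(a, f, p)[1]
    while e:
        if e & 1:
            result = poly_divmod(poly_mul(result, a, p), f, p)[1]
        a = poly_divmod(poly_mul(a, a, p), f, p)[1]
        e >>= 1
    return result

def ddf(f, p):
    # f: monic and squarefree over F_p.
    # Returns [(d, g_d)], where g_d is the product of all monic
    # irreducible factors of f of degree d (only nontrivial g_d are listed).
    x = [0, 1]
    h = poly_divmod(x, f, p)[1]
    result = []
    d = 0
    while len(f) - 1 >= 2 * (d + 1):
        d += 1
        h = poly_powmod(h, p, f, p)            # h = x^(p^d) mod f
        g = poly_gcd(f, poly_sub(h, x, p), p)  # gcd(f, x^(p^d) - x)
        if len(g) > 1:
            result.append((d, g))
            f = poly_divmod(f, g, p)[0]
            h = poly_divmod(h, f, p)[1]
    if len(f) > 1:
        # What remains is irreducible: it has no factor of degree <= deg(f)/2.
        result.append((len(f) - 1, f))
    return result

# x^4 + x over F_2 splits into degree-1 part x^2 + x and degree-2 part x^2 + x + 1.
print(ddf([0, 1, 0, 0, 1], 2))
```

In this sketch each value of d costs one modular exponentiation and one gcd, performed strictly in sequence; the coarse and fine steps mentioned above reorganize this work so that it can be batched and scheduled across processors.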