A GMP-based implementation of Schönhage-Strassen's large integer multiplication algorithm

Schönhage-Strassen's algorithm is one of the best known algorithms for multiplying large integers. Implementing it efficiently is of utmost importance, since many other algorithms rely on it as a subroutine. We present here an improved implementation, based on the one distributed within the GMP library. The following ideas and techniques were used or tried: faster arithmetic modulo 2^n + 1, improved cache locality, Mersenne transforms, Chinese Remainder Reconstruction, the √2 trick, Harley's and Granlund's tricks, improved tuning.
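As a small illustration of two of the listed ideas, the sketch below shows why arithmetic modulo 2^n + 1 is cheap and what the √2 trick relies on. It is not the GMP code: it uses Python integers instead of limb arrays, and the function names are hypothetical. Since 2^n ≡ -1 (mod 2^n + 1), reduction is an alternating sum of n-bit chunks, multiplication by a power of 2 is a shift with a possible sign flip, and when n is divisible by 4 the value 2^(3n/4) - 2^(n/4) squares to 2.

```python
def reduce_mod_fermat(x: int, n: int) -> int:
    """Reduce a non-negative x modulo 2**n + 1 using 2**n = -1 (mod 2**n + 1)."""
    mask = (1 << n) - 1
    acc, sign = 0, 1
    while x:
        acc += sign * (x & mask)  # add or subtract the next n-bit chunk
        x >>= n
        sign = -sign              # each further chunk carries a factor of -1
    # Final normalisation; a limb-based version would use a few add/sub steps.
    return acc % ((1 << n) + 1)


def mul_pow2_mod_fermat(a: int, k: int, n: int) -> int:
    """Return a * 2**k mod 2**n + 1 with a shift instead of a multiplication."""
    modulus = (1 << n) + 1
    k %= 2 * n                    # 2 has multiplicative order 2n
    negate = k >= n               # 2**n acts as -1
    if negate:
        k -= n
    r = reduce_mod_fermat(a << k, n)
    return (modulus - r) % modulus if negate else r


def sqrt2_mod_fermat(n: int) -> int:
    """Return an element whose square is 2 modulo 2**n + 1 (requires 4 | n)."""
    if n % 4 != 0:
        raise ValueError("n must be divisible by 4")
    return ((1 << (3 * n // 4)) - (1 << (n // 4))) % ((1 << n) + 1)
```

Because this element squares to 2, it has order 4n, which is what allows a transform of twice the length for the same n.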
