Dense Linear Algebra over Finite Fields: the FFLAS and FFPACK packages

In the past two decades, several efforts have been made to reduce exact linear algebra problems to matrix multiplication in order to provide algorithms with optimal asymptotic complexity. To provide efficient implementations of such algorithms, one needs to be careful with the underlying arithmetic. It is well known that modular techniques such as the Chinese remainder algorithm or p-adic lifting achieve better performance in practice, especially when word-size arithmetic is used. Therefore, finite field arithmetic becomes an important core for efficient exact linear algebra libraries. In this paper we study different implementations of finite fields in order to achieve efficiency for basic linear algebra routines such as the dot product or matrix multiplication; our goal is to provide an exact alternative to the numerical BLAS library. Following the matrix multiplication reductions, our kernel has many symbolic linear algebra applications: symbolic triangularization, system solving, exact determinant computation, and matrix inversion are then studied, and we demonstrate the efficiency of these reductions in practice.
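To make the word-size arithmetic point concrete, below is a minimal C++ sketch of a dot product over Z/pZ with delayed modular reduction: products of reduced entries are accumulated exactly in double precision and the reduction is performed only once per block, as long as the partial sum stays below 2^53. This is an illustration of the general technique only, not the actual FFLAS routine or its API; the function name dot_mod_p and the block-size formula are assumptions made for this example.

```cpp
// Sketch (not the FFLAS API): delayed-reduction dot product over Z/pZ
// for a word-size prime p, accumulating exactly in double precision.
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <vector>

// Entries of a and b are assumed to be already reduced into [0, p-1],
// and p small enough (roughly p < 2^26) that one product fits below 2^53.
double dot_mod_p(const std::vector<double>& a,
                 const std::vector<double>& b,
                 double p) {
    const double bound = 9007199254740992.0;  // 2^53: integers up to here are exact doubles
    // Number of products (each at most (p-1)^2) that can be added to a
    // reduced partial sum (< p) without exceeding 2^53.
    const std::size_t block =
        static_cast<std::size_t>((bound - (p - 1.0)) / ((p - 1.0) * (p - 1.0)));

    double acc = 0.0;
    std::size_t i = 0;
    while (i < a.size()) {
        std::size_t end = std::min(a.size(), i + block);
        for (; i < end; ++i)
            acc += a[i] * b[i];       // exact integer arithmetic in doubles
        acc = std::fmod(acc, p);      // delayed modular reduction, once per block
    }
    return acc;
}

int main() {
    double p = 65521.0;  // a word-size prime
    std::vector<double> a(1000, 12345.0), b(1000, 54321.0);
    std::printf("dot mod p = %.0f\n", dot_mod_p(a, b, p));
    return 0;
}
```

The same delayed-reduction bound is what allows matrix multiplication over a small prime field to be delegated to a floating-point BLAS kernel before a final reduction, which is the strategy the abstract alludes to.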
