A Parallel Hardware Architecture for fast Gaussian Elimination over GF(2)

This paper presents a hardware-optimized variant of the well-known Gaussian elimination algorithm over GF(2) and a highly efficient implementation of it. The proposed hardware architecture can solve any regular or (uniquely solvable) overdetermined linear system of equations (LSE) and is not limited to matrices of a particular structure. Besides solving LSEs, the architecture can also perform the related task of matrix inversion extremely fast. Its average running time for $n \times n$ binary matrices with uniformly distributed entries is $2n$ clock cycles, as opposed to about $\frac{1}{4}n^3$ in software. The average running time remains very close to $2n$ for matrices with densities much greater or lower than 0.5. The architecture has a worst-case time complexity of $O(n^2)$ and a space complexity of $O(n^2)$. With these characteristics, the architecture is particularly suited to efficiently solving medium-sized LSEs such as those that appear in the cryptanalysis of certain classes of stream ciphers. Moreover, we propose a hardware-optimized algorithm for matrix-by-matrix multiplication over GF(2) which runs in linear time and quadratic space on a similar architecture. This opens up the possibility of building a more complex architecture for efficiently solving larger LSEs by means of Strassen's algorithm, which could significantly improve the time complexity of algebraic attacks on various ciphers. As a proof of concept, we realized our architecture on a contemporary low-cost FPGA. The implementation for a $50 \times 50$ LSE can be clocked at a frequency of up to 300 MHz and computes the solution in 0.33 µs on average.
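For orientation, the following is a minimal software sketch of plain Gauss-Jordan elimination over GF(2). It is the kind of cubic-time software baseline the abstract compares against, not the hardware architecture itself. The function names (`pack_row`, `solve_gf2`, `avg_runtime_us`) and the bit-packed row representation are assumptions made for this illustration. The last helper only reproduces the arithmetic behind the quoted figures: $2n$ cycles at 300 MHz for $n = 50$.

```python
def pack_row(bits):
    """Pack a list of 0/1 values into an int; bit j holds column j."""
    return sum(b << j for j, b in enumerate(bits))


def solve_gf2(rows, rhs, n):
    """Solve A x = b over GF(2) by Gauss-Jordan elimination.

    rows: one int per equation, bit j is the coefficient of x_j.
    rhs:  one 0/1 value per equation.
    Returns the unique solution as a list of n bits, or None if the
    system is singular or inconsistent.
    """
    # Augment each row with its right-hand side stored in bit n.
    aug = [r | (b << n) for r, b in zip(rows, rhs)]
    m = len(aug)
    for col in range(n):
        # Find a row at or below position col with a 1 in this column.
        pivot = next((i for i in range(col, m) if (aug[i] >> col) & 1), None)
        if pivot is None:
            return None  # no pivot: not uniquely solvable
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Over GF(2), row addition is XOR; clear the column everywhere else.
        for i in range(m):
            if i != col and (aug[i] >> col) & 1:
                aug[i] ^= aug[col]
    # Extra rows of an overdetermined system must reduce to zero.
    if any(aug[i] for i in range(n, m)):
        return None
    return [(aug[i] >> n) & 1 for i in range(n)]


def avg_runtime_us(n, clock_mhz):
    """Average runtime in microseconds, assuming 2n clock cycles."""
    return 2 * n / clock_mhz


A = [pack_row([1, 1, 0]), pack_row([0, 1, 1]), pack_row([0, 0, 1])]
x = solve_gf2(A, [1, 1, 1], 3)
t = avg_runtime_us(50, 300)
```

Packing each row into a single integer lets one XOR update a whole row at once, which loosely mirrors the row-parallel operations a hardware design can perform, although the loop structure here remains sequential.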
