Implementation of Parallel LFSR-based Applications on an Adaptive DSP featuring a Pipelined Configurable Gate Array

Linear feedback shift registers (LFSRs) are common structures in many application fields, including cryptography, digital broadcasting and communication. High-throughput requirements call for highly parallel implementations, which state-of-the-art systems-on-chip (SoCs) usually provide through application-specific coprocessors. Although this approach achieves the required performance, it quickly proves inflexible when such devices are targeted at, for example, multi-standard modems or security applications in which run-time updates can add value. This paper presents the implementation of parallel LFSR-based applications on an embedded adaptive DSP featuring a Pipelined Configurable Gate Array (PiCoGA). Compared with standard embedded FPGAs, pipelined devices usually provide better performance, e.g. in terms of speed, but they commonly come with the drawback of additional design constraints. As a test case, we consider the 32-bit CRC used in the Ethernet standard. On the target architecture, a parallel LFSR processing 128 bits at a time achieves a throughput of up to ~25 Gbit/s, which is comparable to the performance offered by some ASIC devices.
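To illustrate what "a parallel LFSR processing 128 bits at a time" means, the following is a minimal software sketch and not the PiCoGA mapping described in the paper. It assumes the usual reflected form of the Ethernet CRC-32 (polynomial constant 0xEDB88320, initial state and final XOR of 0xFFFFFFFF). Because the bit-serial LFSR update is linear over GF(2), M serial steps can be collapsed into one XOR network whose columns are precomputed from the serial model. All names here (`serial_step`, `advance`, `parallel_step`, `crc32_parallel`) are hypothetical and chosen for illustration only.

```python
import zlib

POLY = 0xEDB88320  # reflected CRC-32 polynomial constant (assumed Ethernet form)
M = 128            # input bits consumed per parallel step


def serial_step(state, bit):
    """One clock of the bit-serial LFSR: shift right, XOR POLY on feedback."""
    feedback = (state ^ bit) & 1
    state >>= 1
    return state ^ POLY if feedback else state


def advance(state, block, nbits):
    """Feed the nbits low-order bits of block (LSB first) through the serial LFSR."""
    for i in range(nbits):
        state = serial_step(state, (block >> i) & 1)
    return state


# The update is linear over GF(2), so M serial steps equal an XOR of
# precomputed columns: one per state bit and one per input bit.
STATE_COLS = [advance(1 << i, 0, M) for i in range(32)]
INPUT_COLS = [advance(0, 1 << j, M) for j in range(M)]


def parallel_step(state, block):
    """Compute the state after M input bits in a single XOR-network evaluation."""
    nxt = 0
    for i in range(32):
        if (state >> i) & 1:
            nxt ^= STATE_COLS[i]
    for j in range(M):
        if (block >> j) & 1:
            nxt ^= INPUT_COLS[j]
    return nxt


def crc32_parallel(data):
    """CRC-32 of data, using M-bit parallel steps and a serial tail."""
    state = 0xFFFFFFFF
    step_bytes = M // 8
    full = len(data) - len(data) % step_bytes
    for off in range(0, full, step_bytes):
        # Little-endian packing keeps byte order and LSB-first bit order.
        block = int.from_bytes(data[off:off + step_bytes], "little")
        state = parallel_step(state, block)
    tail = data[full:]
    state = advance(state, int.from_bytes(tail, "little"), 8 * len(tail))
    return state ^ 0xFFFFFFFF


if __name__ == "__main__":
    sample = bytes(range(256)) * 3 + b"tail"
    # Cross-check the sketch against the standard library implementation.
    print(crc32_parallel(sample) == zlib.crc32(sample))
```

In hardware, each precomputed column corresponds to a fixed set of XOR taps, so the two loops in `parallel_step` become a combinational XOR tree. The depth of that tree is what a pipelined fabric would have to break into stages, which is one plausible source of the additional design constraints the abstract mentions.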
