A family of scalable FFT architectures and an implementation of 1024-point radix-2 FFT for real-time communications

The paper presents a family of FFT architectures based on the decomposition of the perfect shuffle permutation; each architecture can be designed with a variable number of processing elements. This gives designers a trade-off between speed and complexity (cost and area). A detailed case study covers the implementation of a 1024-point FFT with 2 processing elements in a 45 nm process technology, including area, timing, power, and place-and-route results.
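To make the role of the perfect shuffle concrete, the sketch below shows a classic constant-geometry radix-2 decimation-in-frequency FFT in the style of Pease, in which every stage reads the same operand pairs and the inter-stage data movement is a single perfect shuffle. It is an illustrative software model only and is not the paper's decomposition or hardware design; the function names `perfect_shuffle`, `bit_reverse`, and `constant_geometry_fft` are hypothetical.

```python
import cmath


def perfect_shuffle(seq):
    """Interleave the first and second halves: out[2i] = seq[i], out[2i+1] = seq[i + N/2]."""
    half = len(seq) // 2
    out = [None] * len(seq)
    out[0::2] = seq[:half]
    out[1::2] = seq[half:]
    return out


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of the index i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def constant_geometry_fft(x):
    """Radix-2 DIF FFT in which every stage uses the same wiring (assumed Pease-style model)."""
    n = len(x)
    stages = n.bit_length() - 1
    if n < 2 or (1 << stages) != n:
        raise ValueError("length must be a power of two, at least 2")
    half = n // 2
    data = list(x)
    for s in range(stages):
        upper, lower = [], []
        # In hardware, these N/2 butterflies could be split across P processing elements.
        for i in range(half):
            a, b = data[i], data[i + half]
            k = (i >> s) << s  # twiddle exponent for butterfly i in stage s
            w = cmath.exp(-2j * cmath.pi * k / n)
            upper.append(a + b)
            lower.append((a - b) * w)
        # The same perfect shuffle connects every pair of consecutive stages.
        data = perfect_shuffle(upper + lower)
    # The stages leave the spectrum in bit-reversed order; restore natural order.
    return [data[bit_reverse(k, stages)] for k in range(n)]
```

For example, `perfect_shuffle(list(range(8)))` returns `[0, 4, 1, 5, 2, 6, 3, 7]`, and `constant_geometry_fft` applied to a 1024-sample list performs 10 identical stages of 512 butterflies each. Because the wiring is the same in every stage, the number of butterfly units is a free design parameter, which is the kind of speed versus area trade-off the abstract describes.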
