Paper information - Optimal VLSI complexity design for high speed pipeline FFT using RNS


This study proposes a logic design, based on residue number system (RNS) units, that performs the N-point FFT on a continuous data stream, and evaluates its performance in terms of asymptotic VLSI complexity. The structure is based on quadratic residue number systems (QRNS), which simplify the processing of complex numbers. It is known from the literature that a lower bound on the complexity of a single instance of the FFT problem is A(N)T²(N) = Ω(N² log² N), and that optimal constructive designs have been proposed with T(N) = Θ(log² N). The proposed architecture relies on a very high degree of processing parallelism and on a communication parallelism tailored to the response time of the adders and multipliers used in the design. Furthermore, by pipelining data, it performs as an optimal design for a single FFT instance and features an upper bound on the mean service time of T_m(N) = Θ(log log log N) for each FFT instance. The approach has also been applied to the design of a structure performing a 1024-point FFT with 23-bit data.
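The remark that QRNS simplifies complex arithmetic can be illustrated with a small sketch. It is not the architecture of the paper: it assumes a single prime modulus p with p ≡ 1 (mod 4), whereas a practical RNS would combine several such moduli, and the function names (`find_j`, `to_qrns`, `from_qrns`, `qrns_mul`) are hypothetical. For such a modulus there is an integer j with j² ≡ -1 (mod p), so a complex residue a + ib can be mapped to the pair (a + jb, a - jb) mod p, and a complex product becomes two independent modular multiplications instead of four real multiplications and two additions.

```python
# Minimal QRNS sketch for ONE prime modulus p with p % 4 == 1 (an assumption;
# a real RNS design would use several moduli). Requires Python 3.8+ for pow(x, -1, p).

def find_j(p):
    """Return the smallest j with j*j ≡ -1 (mod p), found by brute force."""
    for j in range(2, p):
        if (j * j) % p == p - 1:
            return j
    raise ValueError("no square root of -1 modulo p; need a prime p ≡ 1 (mod 4)")

def to_qrns(a, b, p, j):
    """Map the complex residue a + i*b to the pair (a + j*b, a - j*b) mod p."""
    return ((a + j * b) % p, (a - j * b) % p)

def from_qrns(z, zc, p, j):
    """Invert to_qrns: recover (a, b) mod p from the pair (z, zc)."""
    inv2 = pow(2, -1, p)        # modular inverse of 2
    inv2j = pow(2 * j, -1, p)   # modular inverse of 2*j
    return ((z + zc) * inv2 % p, (z - zc) * inv2j % p)

def qrns_mul(x, y, p):
    """Complex multiplication in QRNS: two independent modular products."""
    return (x[0] * y[0] % p, x[1] * y[1] % p)

if __name__ == "__main__":
    p = 13
    j = find_j(p)                      # 5, since 5*5 = 25 ≡ -1 (mod 13)
    x = to_qrns(2, 3, p, j)            # 2 + 3i
    y = to_qrns(1, 4, p, j)            # 1 + 4i
    prod = qrns_mul(x, y, p)
    # (2 + 3i)(1 + 4i) = -10 + 11i, i.e. (3, 11) modulo 13
    print(from_qrns(prod[0], prod[1], p, j))   # prints (3, 11)
```

Because the two components never interact, each one can be handled by its own modular multiplier, which is the kind of simplification the abstract attributes to QRNS.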

Giuseppe Alia | Enrico Martinelli
