Low-power application-specific processor for FFT computations

In this paper, we describe a processor architecture tailored for radix-4 and mixed-radix FFT algorithms, which have lower arithmetic complexity than radix-2 algorithms. The processor is based on the transport triggered architecture, and several optimizations are applied to improve its energy efficiency. The processor has been synthesized in a 130 nm standard-cell technology, and the analysis shows that a programmable solution can achieve energy efficiency comparable to that of a fixed-function ASIC.
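
To illustrate the claim about arithmetic complexity, the following minimal sketch compares twiddle-factor multiplication counts for pure radix-2 and pure radix-4 decompositions. It is not taken from the paper: the function names are hypothetical, and the counting model is a simplifying assumption (one complex twiddle multiplication per radix-2 butterfly, three per radix-4 butterfly, with trivial factors such as 1 and -j counted as well). Real implementations skip trivial factors, so actual counts are lower for both radices.

```python
def log2_exact(n):
    """Return log2(n) for a power of two, else raise ValueError."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    return n.bit_length() - 1


def radix2_twiddle_mults(n):
    # Assumed model: log2(n) stages, n/2 butterflies per stage,
    # one complex twiddle multiplication per butterfly.
    return (n // 2) * log2_exact(n)


def radix4_twiddle_mults(n):
    bits = log2_exact(n)
    if bits % 2:
        raise ValueError("n must be a power of four for pure radix-4")
    # Assumed model: log4(n) stages, n/4 butterflies per stage,
    # three complex twiddle multiplications per butterfly.
    return 3 * (n // 4) * (bits // 2)


if __name__ == "__main__":
    for n in (64, 256, 1024):
        r2 = radix2_twiddle_mults(n)
        r4 = radix4_twiddle_mults(n)
        print(f"N={n}: radix-2 {r2}, radix-4 {r4}, ratio {r4 / r2:.2f}")
```

Under this model the radix-4 count is three quarters of the radix-2 count for any power-of-four length. Lengths that are powers of two but not powers of four are where a mixed-radix decomposition would come in.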
