Parametrizable behavioral IP module for a data-localized low-power FFT

FFTs are important modules in embedded telecom systems, many of which require low-power real-time implementations. This paper describes a technique for aggressively localizing data accesses in an (inverse) fast Fourier transform at the source-code level. Neither the global I/O functionality nor the bit-true arithmetic behavior is modified. Typically, 20 to 50% of the background memory accesses can be saved. A heavily parametrizable solution is proposed, which leads to a family of power-optimized algorithm codes. Moreover, efficient coding details for specific instances are shown.
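
The general idea of localizing data accesses can be illustrated with a minimal sketch. This is not the paper's code. It assumes a radix-2 decimation-in-time FFT and fuses pairs of butterfly stages, so that four values are loaded into local variables, passed through two stages, and written back once. All names here (`CountedArray`, `fft_stagewise`, `fft_fused`) are hypothetical, and the access counter merely stands in for background memory traffic.

```python
import cmath


class CountedArray:
    """List wrapper that counts every element read and write (illustrative only)."""

    def __init__(self, data):
        self.data = list(data)
        self.accesses = 0

    def __getitem__(self, i):
        self.accesses += 1
        return self.data[i]

    def __setitem__(self, i, value):
        self.accesses += 1
        self.data[i] = value


def twiddle(p, m):
    """Twiddle factor exp(-2*pi*i*p/m)."""
    return cmath.exp(-2j * cmath.pi * p / m)


def butterfly(u, v, w):
    """Radix-2 butterfly on two local values."""
    t = w * v
    return u + t, u - t


def bit_reverse_copy(x):
    """Return x in bit-reversed order; len(x) must be a power of two."""
    n = len(x)
    bits = n.bit_length() - 1
    out = [0j] * n
    for i, value in enumerate(x):
        r = 0
        for b in range(bits):
            if (i >> b) & 1:
                r |= 1 << (bits - 1 - b)
        out[r] = value
    return out


def single_stage(a, n, m):
    """One butterfly stage of span m: every element is read and written once."""
    h = m // 2
    for k in range(0, n, m):
        for p in range(h):
            u, v = a[k + p], a[k + p + h]
            a[k + p], a[k + p + h] = butterfly(u, v, twiddle(p, m))


def fft_stagewise(x):
    """Baseline: each stage makes a full pass over the array."""
    n = len(x)
    a = CountedArray(bit_reverse_copy(x))
    m = 2
    while m <= n:
        single_stage(a, n, m)
        m *= 2
    return a.data, a.accesses


def fft_fused(x):
    """Fuse stages of span m and 2m: four loads, four butterflies, four stores."""
    n = len(x)
    a = CountedArray(bit_reverse_copy(x))
    m = 2
    while m <= n:
        if 2 * m <= n:
            h = m // 2
            for k in range(0, n, 2 * m):
                for p in range(h):
                    i0, i1, i2, i3 = k + p, k + p + h, k + p + m, k + p + m + h
                    r0, r1, r2, r3 = a[i0], a[i1], a[i2], a[i3]
                    w = twiddle(p, m)
                    r0, r1 = butterfly(r0, r1, w)  # stage of span m
                    r2, r3 = butterfly(r2, r3, w)
                    r0, r2 = butterfly(r0, r2, twiddle(p, 2 * m))  # stage of span 2m
                    r1, r3 = butterfly(r1, r3, twiddle(p + h, 2 * m))
                    a[i0], a[i1], a[i2], a[i3] = r0, r1, r2, r3
            m *= 4
        else:
            single_stage(a, n, m)  # leftover stage when the stage count is odd
            m *= 2
    return a.data, a.accesses
```

Both functions perform the same arithmetic operations on the same operands in the same dependency order, so their outputs are identical. Only the number of counted array accesses differs: for a 16-point input the stage-wise version makes 128 accesses and the fused version 64, and for an 8-point input the counts are 48 and 32.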