Domain Specific Reconfigurable Processing Core Architecture for Digital Filtering Applications

This paper presents a reconfigurable processing core architecture targeted at digital filtering applications. The architecture can be configured to execute linear-phase FIR filters, DLMS adaptive FIR filters, the (I)FFT, and the 2D-(I)DCT with high performance and low energy consumption, which it achieves by reducing the heavy routing resources used extensively in other reconfigurable architectures. The pipeline depth of the multipliers in the processing core is locally controlled, so that power consumption is reduced by minimizing unnecessary register switching. We show that the proposed processing core consumes less energy than existing reconfigurable architectures from academia and industry that have been tailored for these applications, with comparable or better performance. The circuit is designed in a 0.35-μm CMOS process technology with a 3.3 V supply voltage.
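As a point of orientation for the first kernel listed above, the following is a minimal software sketch of a linear-phase FIR filter that exploits coefficient symmetry. It is an illustration only and not the architecture described in the paper: the function names `fir_direct` and `fir_linear_phase` are hypothetical, and the zero initial state is an assumption made for simplicity. It shows the general, well-known property that a symmetric impulse response lets mirrored input samples be added before multiplication, so roughly half as many multiplications are needed per output sample.

```python
def fir_direct(x, h):
    """Direct-form FIR: y[n] = sum_k h[k] * x[n-k], with zero initial state."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, hk in enumerate(h):
            if n - k >= 0:
                acc += hk * x[n - k]
        y.append(acc)
    return y


def fir_linear_phase(x, h):
    """FIR for a symmetric impulse response (h[k] == h[N-1-k]).

    Mirrored input samples are added first and multiplied once, so an
    N-tap filter needs ceil(N/2) multiplications per output, not N.
    """
    n_taps = len(h)
    half = n_taps // 2
    if any(h[k] != h[n_taps - 1 - k] for k in range(half)):
        raise ValueError("coefficients are not symmetric")

    def sample(i):
        # Assumed zero initial state: samples before x[0] read as 0.
        return x[i] if i >= 0 else 0.0

    y = []
    for n in range(len(x)):
        acc = 0.0
        for k in range(half):
            # Pre-add the two samples that share coefficient h[k].
            acc += h[k] * (sample(n - k) + sample(n - (n_taps - 1 - k)))
        if n_taps % 2:
            # Odd length: the centre tap has no mirror partner.
            acc += h[half] * sample(n - half)
        y.append(acc)
    return y
```

Both functions produce the same output for any symmetric `h`; the folded form only changes how the arithmetic is grouped.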
