FPGA Realization of FIR Filters by Efficient and Flexible Systolization Using Distributed Arithmetic

In this paper, we present the design optimization of one- and two-dimensional fully pipelined computing structures for area-delay-power-efficient implementation of finite-impulse-response (FIR) filters by systolic decomposition of distributed arithmetic (DA)-based inner-product computation. The systolic decomposition scheme is found to offer a flexible choice of the address length of the lookup tables (LUTs) for DA-based computation, so that a suitable area-time tradeoff can be selected. It is observed that using smaller address lengths for the DA-based computing units reduces the memory size, but increases the adder complexity and the latency. For efficient DA-based realization of FIR filters of different orders, the flexible linear systolic design is implemented on a Xilinx Virtex-E XCV2000E FPGA using a hybrid combination of Handel-C and parameterizable VHDL cores. Various key performance metrics such as number of slices, maximum usable frequency, dynamic power consumption, energy density, and energy throughput are estimated for different filter orders and address lengths. Analysis of the results indicates that the performance metrics of the proposed implementation are broadly in line with theoretical expectations. It is found that the choice of address length yields the best of area-delay-power-efficient realizations of the FIR filter for various filter orders. Moreover, the proposed FPGA implementation is found to involve significantly less area-delay complexity than the existing DA-based implementations of FIR filters.
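
The tradeoff between LUT address length, memory size, and adder count can be illustrated in software. The following is a minimal Python sketch of a DA-based inner product split into several small LUTs; it is not the systolic architecture of the paper, and every name in it (`build_lut`, `da_inner_product`, `memory_words`, `addr_len`, `word_bits`) is a hypothetical choice made for illustration. It assumes integer coefficients, two's-complement input samples of `word_bits` bits, and a number of taps that is a multiple of the address length.

```python
def build_lut(coeffs):
    """Return the 2**M partial sums of the M coefficients in `coeffs`.

    Entry `addr` holds the sum of coeffs[i] for every bit i set in `addr`.
    """
    m = len(coeffs)
    return [sum(c for i, c in enumerate(coeffs) if (addr >> i) & 1)
            for addr in range(1 << m)]


def da_inner_product(coeffs, samples, word_bits, addr_len):
    """Bit-serial DA inner product using N/addr_len LUTs (illustrative only)."""
    assert len(coeffs) == len(samples)
    assert len(coeffs) % addr_len == 0
    starts = range(0, len(coeffs), addr_len)
    luts = [build_lut(coeffs[s:s + addr_len]) for s in starts]
    mask = (1 << word_bits) - 1  # two's-complement view of each sample

    acc = 0
    for j in range(word_bits):  # one bit-slice of all samples per step
        partial = 0
        for lut, s in zip(luts, starts):
            addr = 0
            for i in range(addr_len):
                bit = ((samples[s + i] & mask) >> j) & 1
                addr |= bit << i
            partial += lut[addr]  # the LUT outputs are combined by adders
        # The most significant bit carries negative weight in two's complement.
        weight = -(1 << j) if j == word_bits - 1 else (1 << j)
        acc += weight * partial
    return acc


def memory_words(n_taps, addr_len):
    """Total LUT words: (N / M) tables of 2**M entries each."""
    return (n_taps // addr_len) * (1 << addr_len)


if __name__ == "__main__":
    h = [3, -1, 4, 1, -5, 9, 2, -6]   # example coefficients (made up)
    x = [7, -8, 5, 0, -3, 2, -1, 6]   # example 4-bit signed samples (made up)
    for m in (1, 2, 4, 8):
        y = da_inner_product(h, x, word_bits=4, addr_len=m)
        print(m, y, memory_words(len(h), m), len(h) // m - 1)
```

For each address length the script prints the inner product, the total number of LUT words, and the number of adders needed to combine the LUT outputs for one bit-slice. In this sketch, a smaller address length shrinks the tables but requires more tables and hence more additions, which is the tradeoff described above.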
