FPGA Design and Implementation of Matrix Multiplication Architecture by PPI-MO Techniques

Matrix multiplication is the kernel operation in many transform, image, and discrete signal processing applications. We develop new algorithms and techniques for matrix multiplication on configurable devices. In this paper, we propose three designs for matrix-matrix multiplication. These designs offer reduced hardware complexity and different throughput rates and input/output data formats to match different application needs. The designs have been implemented on a Virtex-4 FPGA. We have synthesized the proposed designs and the existing design using Synopsys tools. Interestingly, the proposed parallel-fixed-input and multiple-output (PPI-MO) structure consumes 40% less energy than the other two proposed structures and 70% less energy than the existing structure.
