Paper information - Parallel Implementation of Cholesky LLT-Algorithm in FPGA-Based Processor

A fixed-size processor array architecture is proposed for matrix LLT-decomposition based on the Cholesky algorithm. To implement this architecture in modern FPGA devices, an arithmetic unit (AU) operating in rational fraction arithmetic is designed. The AU is intended for configuration in Xilinx Virtex-4 FPGAs, and its hardware complexity is much lower than that of comparable AUs operating on floating-point numbers.
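
For orientation, the LLT-decomposition named in the abstract can be sketched in a few lines of sequential Python. This is the standard textbook Cholesky algorithm, not the paper's processor array or its AU: it uses ordinary floating-point numbers rather than rational fraction arithmetic, and the function name `cholesky_llt` is a hypothetical name chosen for this illustration.

```python
import math


def cholesky_llt(a):
    """Return lower-triangular L with L * L^T == a.

    Assumes `a` is a symmetric positive-definite matrix given as a list
    of rows. Plain floats are used here as a simplifying assumption.
    """
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry: subtract the squares already placed in row j.
        s = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if s <= 0.0:
            raise ValueError("matrix is not positive definite")
        l[j][j] = math.sqrt(s)
        # Entries below the diagonal in column j.
        for i in range(j + 1, n):
            dot = sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = (a[i][j] - dot) / l[j][j]
    return l


if __name__ == "__main__":
    a = [[4.0, 12.0, -16.0],
         [12.0, 37.0, -43.0],
         [-16.0, -43.0, 98.0]]
    for row in cholesky_llt(a):
        print(row)
```

Each column j depends only on the columns before it, while the entries within a column are independent of one another. That independence is the kind of regularity a processor array can exploit.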

[1] Reinhard Männer, et al. Using floating-point arithmetic on FPGAs to accelerate scientific N-Body simulations, 2002, Proceedings. 10th Annual IEEE Symposium on Field-Programmable Custom Computing Machines.

[2] Warren J. Gross, et al. Sparse Matrix-Vector Multiplication for Finite Element Method Matrices on FPGAs, 2006, 2006 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines.

[3] Oleg Maslennikov, et al. FPGA Implementation of the Conjugate Gradient Method, 2005, PPAM.

[4] Dennis W. Prather, et al. FPGA-based acceleration of the 3D finite-difference time-domain method, 2004, 12th Annual IEEE Symposium on Field-Programmable Custom Computing Machines.

[5] Oleg Maslennikov, et al. Configurable Microprocessor Array for DSP Applications, 2003, PPAM.

[6] Mary Jane Irwin, et al. A rational arithmetic processor, 1981, 1981 IEEE 5th Symposium on Computer Arithmetic (ARITH).

[7] Robert Strzodka, et al. Pipelined Mixed Precision Algorithms on FPGAs for Fast and Accurate PDE Solvers from Low Precision Components, 2006, 2006 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines.

[8] Karl S. Hemmert, et al. Embedded floating-point units in FPGAs, 2006, FPGA '06.

[9] Chika O. Nwankpa, et al. High-Performance Linear Algebra Processor using FPGA, 2004.

[10] Jirí Kadlec, et al. Logarithmic Number System and Floating-Point Arithmetics on FPGA, 2002, FPL.

[11] S. Kung, et al. VLSI Array processors, 1985, IEEE ASSP Magazine.

[12] Yong Dou, et al. 64-bit floating-point FPGA matrix multiplication, 2005, FPGA '05.

[13] Patrice Quinton, et al. Systolic algorithms and architectures, 1987.

[14] Peter M. Athanas, et al. Examining the Viability of FPGA Supercomputing, 2007, EURASIP J. Embed. Syst.

[15] Gene H. Golub, et al. Matrix computations, 1983.

[16] Karl S. Hemmert, et al. Closing the gap: CPU and FPGA trends in sustainable floating-point BLAS performance, 2004, 12th Annual IEEE Symposium on Field-Programmable Custom Computing Machines.

[17] Denis Trystram, et al. Parallel algorithms and architectures, 1995.