A Fine-grained Pipelined Implementation of the LINPACK Benchmark on FPGAs
Yong Dou | Jingfei Jiang | Miao Wang | Yuanwu Lei | Jie Zhou | Guiming Wu
[1] Brent E. Nelson, et al. Novel Optimizations for Hardware Floating-Point Units in a Modern FPGA Architecture, 2002, FPL.
[2] Brent E. Nelson, et al. Tradeoffs of designing floating-point division and square root on Virtex FPGAs, 2003, 11th Annual IEEE Symposium on Field-Programmable Custom Computing Machines, 2003. FCCM 2003.
[3] Aravind Dasu, et al. Performance of a LU decomposition on a multi-FPGA system compared to a low power commodity microprocessor system, 2007, Scalable Comput. Pract. Exp.
[4] Philip Heng Wai Leong, et al. FPGA Based Acceleration of the Linpack Benchmark: A High Level Code Transformation Approach, 2006, 2006 International Conference on Field Programmable Logic and Applications.
[5] Sanjay V. Rajopadhye, et al. An Improved Systolic Architecture for LU Decomposition, 2006, IEEE 17th International Conference on Application-specific Systems, Architectures and Processors (ASAP'06).
[6] Viktor K. Prasanna, et al. High-Performance Designs for Linear Algebra Operations on Reconfigurable Hardware, 2008, IEEE Transactions on Computers.
[7] Viktor K. Prasanna, et al. A high-performance and energy-efficient architecture for floating-point based LU decomposition on FPGAs, 2004, 18th International Parallel and Distributed Processing Symposium, 2004. Proceedings.
[8] Russell Tessier, et al. Floating point unit generation and evaluation for FPGAs, 2003, 11th Annual IEEE Symposium on Field-Programmable Custom Computing Machines, 2003. FCCM 2003.
[9] Karl S. Hemmert, et al. Closing the gap: CPU and FPGA trends in sustainable floating-point BLAS performance, 2004, 12th Annual IEEE Symposium on Field-Programmable Custom Computing Machines.
[10] Viktor K. Prasanna, et al. Time and Energy Efficient Matrix Factorization Using FPGAs, 2003, FPL.
[11] Yong Dou, et al. 64-bit floating-point FPGA matrix multiplication, 2005, FPGA '05.
[12] Gregory D. Peterson, et al. High-Performance Mixed-Precision Linear Solver for FPGAs, 2008, IEEE Transactions on Computers.
[13] Viktor K. Prasanna, et al. High Performance Linear Algebra Operations on Reconfigurable Systems, 2005, ACM/IEEE SC 2005 Conference (SC'05).
[14] Jack J. Dongarra, et al. The LINPACK Benchmark: past, present and future, 2003, Concurr. Comput. Pract. Exp.
[15] Keshab K. Parhi, et al. A Fast Radix-4 Division Algorithm and Its Architecture, 1995, IEEE Trans. Computers.