Floating-point sparse matrix-vector multiply for FPGAs
[1] L. Trefethen, et al. Numerical linear algebra, 1997.
[2] Frederic T. Chong, et al. METRO: a router architecture for high-performance, short-haul routing networks, 1994, ISCA '94.
[3] Keith D. Underwood, et al. FPGAs vs. CPUs: trends in peak floating-point performance, 2004, FPGA '04.
[4] Richard Vuduc, et al. Automatic performance tuning of sparse matrix kernels, 2003.
[5] Dorit S. Hochbaum, et al. Approximation Algorithms for NP-Hard Problems, 1997, SIGA.
[6] R. Melhem. A systolic accelerator for the iterative solution of sparse linear systems, 1989.
[7] Andrew B. Kahng, et al. Improved algorithms for hypergraph bipartitioning, 2000, ASP-DAC '00.
[8] Jung Ho Ahn, et al. Merrimac: Supercomputing with Streams, 2003, ACM/IEEE SC 2003 Conference (SC'03).
[9] John Wawrzynek, et al. Stochastic, spatial routing for hypergraphs, trees, and meshes, 2003, FPGA '03.
[10] Pavle Belanovic, et al. A Library of Parameterized Floating-Point Modules and Their Use, 2002, FPL.
[11] James Demmel, et al. Performance Optimizations and Bounds for Sparse Matrix-Vector Multiply, 2002, ACM/IEEE SC 2002 Conference (SC'02).
[12] Roy L. Russo, et al. On a Pin Versus Block Relationship For Partitions of Logic Graphs, 1971, IEEE Transactions on Computers.
[13] Charles E. Leiserson, et al. Optimizing Synchronous Circuitry by Retiming (Preliminary Version), 1983.
[14] J. Shalf, et al. Evaluation of Cache-based Superscalar and Cacheless Vector Architectures for Scientific Computations, 2003, ACM/IEEE SC 2003 Conference (SC'03).
[15] Brad L. Hutchings, et al. JHDL-an HDL for reconfigurable systems, 1998, Proceedings. IEEE Symposium on FPGAs for Custom Computing Machines (Cat. No.98TB100251).