Accelerator for Sparse Machine Learning

Sparse matrix by vector multiplication (SpMV) plays a pivotal role in machine learning and data mining. We propose and investigate an SpMV accelerator designed specifically to accelerate sparse matrix by sparse vector multiplication (SpMSpV) and to be integrated into a CPU core. We show that our accelerator outperforms a similar solution by 70x while achieving 8x higher power efficiency, which yields an estimated 29x energy reduction for SpMSpV-based applications.
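
For readers unfamiliar with the operation, the following is a minimal software sketch of SpMSpV. It is not the accelerator's design. It assumes the matrix is stored in compressed sparse column (CSC) form and the vector as parallel index/value lists. The function name `spmspv` and all variable names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def spmspv(col_ptr, row_idx, values, x_idx, x_val):
    """Multiply a CSC sparse matrix by a sparse vector (illustrative only).

    col_ptr, row_idx, values: CSC arrays of the matrix (assumed layout).
    x_idx, x_val: indices and values of the nonzero vector entries.
    Returns the nonzero entries of the result as sorted (row, value) pairs.
    """
    acc = defaultdict(float)
    for j, xj in zip(x_idx, x_val):
        # Only columns that match a nonzero entry of x are visited.
        for k in range(col_ptr[j], col_ptr[j + 1]):
            acc[row_idx[k]] += values[k] * xj
    # Drop entries that cancelled to exactly zero.
    return sorted((i, v) for i, v in acc.items() if v != 0.0)


# Example matrix (3x3):
# [[1, 0, 2],
#  [0, 3, 0],
#  [4, 0, 5]]
col_ptr = [0, 2, 3, 5]
row_idx = [0, 2, 1, 0, 2]
values = [1.0, 4.0, 3.0, 2.0, 5.0]

# Sparse vector x = [0, 0, 2], stored as one (index, value) pair.
result = spmspv(col_ptr, row_idx, values, [2], [2.0])
print(result)
```

Because the loop visits only the columns selected by nonzero vector entries, the work scales with the number of matching nonzeros, not with the full matrix dimensions.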
