Graphulo implementation of server-side sparse matrix multiply in the Accumulo database

The Apache Accumulo database excels at distributed storage and indexing and is ideally suited for storing graph data. Many big data analytics compute on graph data and persist their results back to the database. These graph calculations are often best performed inside the database server. The GraphBLAS standard provides a compact and efficient basis for a wide range of graph applications through a small number of sparse matrix operations. In this article, we discuss a server-side implementation of GraphBLAS sparse matrix multiplication that leverages Accumulo's native, high-performance iterators. We compare the mathematics and performance of inner and outer product implementations, and show how an outer product implementation achieves optimal performance near Accumulo's peak write rate. We offer our work as a core component to the Graphulo library that will deliver matrix math primitives for graph analytics within Accumulo.

[1]  John R. Gilbert,et al.  The Combinatorial BLAS: design, implementation, and applications , 2011, Int. J. High Perform. Comput. Appl..

[2]  John Shalf,et al.  SEJITS: Getting Productivity and Performance With Selective Embedded JIT Specialization , 2010 .

[3]  Ranjan Sen,et al.  Benchmarking Apache Accumulo BigData Distributed Table Store Using Its Continuous Test Suite , 2013, 2013 IEEE International Congress on Big Data.

[4]  John R. Gilbert,et al.  Parallel Sparse Matrix-Matrix Multiplication and Indexing: Implementation and Experiments , 2011, SIAM J. Sci. Comput..

[5]  David A. Bader Designing Scalable Synthetic Compact Applications for Benchmarking High Productivity Computing Systems , 2006 .

[6]  Joseph M. Hellerstein,et al.  MAD Skills: New Analysis Practices for Big Data , 2009, Proc. VLDB Endow..

[7]  Jeremy Kepner,et al.  Achieving 100,000,000 database inserts per second using Accumulo and D4M , 2014, 2014 IEEE High Performance Extreme Computing Conference (HPEC).

[8]  Scott M. Sawyer,et al.  Understanding query performance in Accumulo , 2013, 2013 IEEE High Performance Extreme Computing Conference (HPEC).

[9]  Miriam A. M. Capretz,et al.  Data management in cloud environments: NoSQL and NewSQL data stores , 2013, Journal of Cloud Computing: Advances, Systems and Applications.

[10]  Michael Stonebraker,et al.  Standards for graph algorithm primitives , 2014, 2013 IEEE High Performance Extreme Computing Conference (HPEC).

[11]  John R. Gilbert,et al.  Highly Parallel Sparse Matrix-Matrix Multiplication , 2010, ArXiv.

[12]  Jeremy Kepner,et al.  Dynamic distributed dimensional data model (D4M) database and computation system , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[13]  Jeremy Kepner,et al.  D4M 2.0 schema: A general purpose high performance schema for the Accumulo database , 2013, 2013 IEEE High Performance Extreme Computing Conference (HPEC).

[14]  Samuel Williams,et al.  High-Productivity and High-Performance Analysis of Filtered Semantic Graphs , 2013, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing.

[15]  Jeremy Kepner,et al.  Graphulo: Linear Algebra Graph Kernels for NoSQL Databases , 2015, 2015 IEEE International Parallel and Distributed Processing Symposium Workshop.

[16]  Carlo Curino,et al.  Apache Hadoop YARN: yet another resource negotiator , 2013, SoCC.

[17]  Tinkara Toš,et al.  Graph Algorithms in the Language of Linear Algebra , 2012, Software, environments, tools.

[18]  Larry Rudolph,et al.  Techniques for Parallel Manipulation of Sparse Matrices , 1989, Theor. Comput. Sci..