Dynamic distributed dimensional data model (D4M) database and computation system

A crucial element of large web companies is their ability to collect and analyze massive amounts of data. Tuple store databases are a key enabling technology employed by many of these companies (e.g., Google Bigtable and Amazon Dynamo). Tuple stores are highly scalable and run on commodity clusters, but they lack interfaces that support efficient development of mathematically based analytics. D4M (Dynamic Distributed Dimensional Data Model) has been developed to provide a mathematically rich interface to tuple stores and to Structured Query Language (SQL) databases. D4M allows linear algebra to be readily applied to databases. Using D4M, it is possible to create composable analytics with significantly less effort than with traditional approaches. This work describes the D4M technology, its applications, and its performance.
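To make the idea of applying linear algebra to a tuple store concrete, here is a minimal Python sketch. It is not the D4M API: the triples, the key names, and the helpers `to_assoc`, `transpose`, and `matmul` are all hypothetical, chosen only for illustration. It treats (row, column, value) triples as a sparse matrix indexed by string keys and computes a column co-occurrence table as the product of the transposed matrix with the matrix itself.

```python
from collections import defaultdict

# Hypothetical tuple-store contents: (row key, column key, value) triples.
triples = [
    ("doc1", "word|alpha", 1),
    ("doc1", "word|beta", 1),
    ("doc2", "word|beta", 1),
    ("doc2", "word|gamma", 1),
    ("doc3", "word|alpha", 1),
    ("doc3", "word|beta", 1),
]

def to_assoc(triples):
    """Build a sparse, string-keyed matrix {row: {col: value}} from triples."""
    a = defaultdict(dict)
    for r, c, v in triples:
        a[r][c] = a[r].get(c, 0) + v
    return dict(a)

def transpose(a):
    """Swap row and column keys."""
    t = defaultdict(dict)
    for r, cols in a.items():
        for c, v in cols.items():
            t[c][r] = v
    return dict(t)

def matmul(a, b):
    """Sparse matrix product; only keys present in both operands contribute."""
    out = defaultdict(dict)
    for r, cols in a.items():
        for k, v in cols.items():
            for c, w in b.get(k, {}).items():
                out[r][c] = out[r].get(c, 0) + v * w
    return dict(out)

A = to_assoc(triples)
# Entry [x][y] counts the rows (documents) that contain both column x and column y.
cooccurrence = matmul(transpose(A), A)
```

Because the result has the same string-keyed form as its inputs, it can be fed into further operations, which is one way to read the abstract's claim that such analytics are composable.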
