Violet: A Storage Stack for IOPS/Capacity Bifurcated Storage Environments

In this paper we describe a storage system called Violet that efficiently marries fine-grained host side data management with capacity optimized backend disk systems. Currently, for efficiency reasons, real-time analytics applications are forced to map their in-memory graph like data structures on to columnar databases or other intermediate disk friendly data structures when they are persisting these data structures to protect them from node failures. Violet provides efficient fine-grained end-to-end data management functionality that obviates the need to perform this intermediate mapping. Violet presents the following two key innovations that allow us to efficiently do this mapping between the finegrained host side data structures and capacity optimized backend disk system: 1) efficient identification of updates on the host that leverages hardware in-memory transaction mechanisms and 2) efficient streaming of fine-grained updates on to a disk using a new data structure called Fibonacci Array.

[1]  Mahadev Satyanarayanan,et al.  Lightweight recoverable virtual memory , 1993, SOSP '93.

[2]  Wolfgang Lehner,et al.  SAP HANA database: data management for modern business applications , 2012, SGMD.

[3]  Michael Stonebraker,et al.  The End of an Architectural Era (It's Time for a Complete Rewrite) , 2007, VLDB.

[4]  Viktor Leis,et al.  Exploiting hardware transactional memory in main-memory databases , 2014, 2014 IEEE 30th International Conference on Data Engineering.

[5]  Martin L. Kersten,et al.  Database Architecture Optimized for the New Bottleneck: Memory Access , 1999, VLDB.

[6]  Michael A. Bender,et al.  Cache-oblivious streaming B-trees , 2007, SPAA '07.

[7]  Hui Ding,et al.  TAO: Facebook's Distributed Data Store for the Social Graph , 2013, USENIX Annual Technical Conference.

[8]  Haibo Chen,et al.  Using restricted transactional memory to build a scalable in-memory database , 2014, EuroSys '14.

[9]  Henrik Loeser,et al.  "One Size Fits All": An Idea Whose Time Has Come and Gone? , 2011, BTW.

[10]  Wolfgang Lehner,et al.  The Graph Story of the SAP HANA Database , 2013, BTW.

[11]  Hans-Peter Kriegel,et al.  A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise , 1996, KDD.

[12]  Maurice Herlihy,et al.  Transactional Memory: Architectural Support For Lock-free Data Structures , 1993, Proceedings of the 20th Annual International Symposium on Computer Architecture.

[13]  Richard F. Freitas,et al.  Storage class memory: Technology, systems and applications , 2009, 2010 IEEE Hot Chips 22 Symposium (HCS).

[14]  Ming Wu,et al.  Managing Large Graphs on Multi-Cores with Graph Awareness , 2012, USENIX Annual Technical Conference.

[15]  Jinpeng Wei,et al.  Software Persistent Memory , 2012, USENIX Annual Technical Conference.

[16]  Patrick E. O'Neil,et al.  The log-structured merge-tree (LSM-tree) , 1996, Acta Informatica.

[17]  Franz Aurenhammer,et al.  Voronoi diagrams—a survey of a fundamental geometric data structure , 1991, CSUR.