NIOSIT: efficient data access for log-structured merge-tree style storage systems

In recent years, log-structured merge-tree (LSM-tree) style storage has been widely adopted in distributed data storage systems (e.g., Bigtable and HBase) and commercial database systems (e.g., OceanBase, Cassandra, and SQLite) to provide both large storage capacity and high-performance data updates. Write operations become cheaper because LSM-tree style storage avoids in-place writes by updating a copy of the data in memory. Read operations suffer, however, because they require an additional step during data compaction to check whether a newer update of the requested record exists in memory. In real workloads this causes many costly empty reads, since the volume of immutable data is typically far larger than the incremental delta data. To address this issue, we design a new network request processing mechanism that allows data access to be handled in an auxiliary, lightweight network communication IO thread. A Bloom filter is incorporated into the network IO thread to filter out empty reads effectively. We also analyze the efficiency advantage of the mechanism and describe its detailed implementation on the well-known OceanBase system. An experimental study using the YCSB benchmark demonstrates that the proposed mechanism achieves 20 to 30 percent better performance than the existing method.
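The idea of filtering empty reads can be illustrated with a minimal single-process sketch. It is not the OceanBase implementation described in the paper: the class and function names (`BloomFilter`, `DeltaStore`, `read`), the filter size, and the hashing scheme are all assumptions chosen for illustration, and the network IO thread is omitted entirely. The sketch only shows how a Bloom filter kept alongside the in-memory delta data lets a read skip the delta lookup when the key was never updated.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter using double hashing over a SHA-256 digest."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        # Forcing the step to be odd avoids a zero stride.
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


class DeltaStore:
    """Hypothetical in-memory store of incremental updates (the delta data)."""

    def __init__(self):
        self.rows = {}
        self.filter = BloomFilter()
        self.lookups = 0  # counts how many reads actually reached the store

    def put(self, key, value):
        self.rows[key] = value
        self.filter.add(key)

    def get(self, key):
        self.lookups += 1
        return self.rows.get(key)


def read(key, delta, baseline):
    """Return the newest value: the delta copy if one exists, else the baseline."""
    if delta.filter.might_contain(key):
        value = delta.get(key)
        if value is not None:
            return value
    # Either the filter ruled the key out (an avoided empty read)
    # or the filter gave a false positive and the delta lookup was empty.
    return baseline.get(key)


if __name__ == "__main__":
    baseline = {"user1": "old", "user2": "static"}  # stands in for immutable data
    delta = DeltaStore()
    delta.put("user1", "new")
    print(read("user1", delta, baseline))
    print(read("user2", delta, baseline))
    print("delta lookups:", delta.lookups)
```

A Bloom filter never reports a stored key as absent, so skipping the delta lookup on a negative answer cannot miss an update; a false positive only costs one unnecessary lookup.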
