ASLM: Adaptive Single Layer Model for Learned Index

Index structures such as B-trees are important tools that DBAs use to improve data access performance. However, with the arrival of the big data era, the amount of data generated in different domains has exploded. A recent study has shown that indexes consume about 55% of total memory in a state-of-the-art in-memory DBMS. Building indexes in traditional ways has hit a bottleneck. Recent work proposes using neural network models to replace B-trees and many other indexes. However, the proposed model is heavy and inaccurate, and it fails to consider model updating. In this paper, a novel, simple learned index called the adaptive single layer model (ASLM) is proposed to replace the B-tree index. The proposed model uses two data partition methods, is well organized, and can be applied to different workloads. Updating is also taken into consideration. The proposed model, incorporating the two data partition methods, is evaluated on two datasets. The results show that the prediction error is reduced by around 50% and demonstrate that the proposed model is more accurate, stable, and effective than the existing model.
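To make the idea of a partitioned, single-layer learned index concrete, the following is a minimal sketch and not the paper's implementation. It assumes equal-size partitions over sorted integer keys (the abstract does not specify the two partition methods) and substitutes a least-squares line for each partition's model. The names `SingleLayerIndex`, `fit_line`, and `num_partitions` are hypothetical.

```python
from bisect import bisect_left, bisect_right


def fit_line(xs, ys):
    """Ordinary least-squares fit y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var = sum((x - mean_x) ** 2 for x in xs)
    if var == 0:  # all keys identical (or a single key): constant model
        return 0.0, mean_y
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov / var
    return slope, mean_y - slope * mean_x


class SingleLayerIndex:
    """Illustrative learned index: one simple model per data partition.

    Assumption: equal-size partitioning and linear models are stand-ins
    chosen for brevity; they are not taken from the paper.
    """

    def __init__(self, keys, num_partitions=4):
        self.keys = sorted(keys)
        n = len(self.keys)
        size = max(1, -(-n // num_partitions))  # ceiling division
        self.bounds = []  # first key of each partition
        self.models = []  # (slope, intercept, max_error) per partition
        for start in range(0, n, size):
            part = self.keys[start:start + size]
            positions = list(range(start, start + len(part)))
            slope, intercept = fit_line(part, positions)
            # Record the worst prediction error seen during training so
            # that lookups can bound their local search window.
            max_err = max(
                abs(round(slope * k + intercept) - p)
                for k, p in zip(part, positions)
            )
            self.bounds.append(part[0])
            self.models.append((slope, intercept, max_err))

    def lookup(self, key):
        """Return a position of `key` in the sorted array, or None."""
        n = len(self.keys)
        if n == 0:
            return None
        # Route the key to the partition whose range should contain it.
        i = max(0, bisect_right(self.bounds, key) - 1)
        slope, intercept, max_err = self.models[i]
        guess = round(slope * key + intercept)
        # Search only within the error window around the prediction.
        lo = min(n, max(0, guess - max_err))
        hi = max(lo, min(n, guess + max_err + 1))
        j = bisect_left(self.keys, key, lo, hi)
        if j < n and self.keys[j] == key:
            return j
        return None
```

A lookup first routes the key to a partition, asks that partition's model for a predicted position, and then corrects the prediction with a binary search limited to the recorded error window. A smaller prediction error therefore translates directly into a narrower search, which is why prediction error is the metric the abstract reports.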
