Paper Information - Secondary Indexing Techniques for Key-Value Stores: Two Rings To Rule Them All

Secondary indices are traditionally used in DBMSs to speed up queries that do not read data by the table's keys. Many of the newer NoSQL distributed data stores, however, do not yet have a built-in secondary indexing feature, even when they provide a table-based data model, as HBase does. In this paper, we explore the challenges of indexing modern distributed table-based data stores and investigate two secondary index approaches that we have integrated into HBase. Our detailed analysis and experimental results demonstrate the benefits of both approaches. Further, we show that secondary index implementation decisions cannot be made in isolation from the data distribution and that different indexing approaches can cater to different needs.
