Evaluating and Optimizing the Storage Strategies for an Elastic Object Store

In this paper, we evaluate storage strategies for different kinds of data and their indices in Punt Table, a NoSQL database designed for elastic object storage. Punt Table stores and retrieves objects in a schema-free way and builds indices to support queries on the fields inside the objects. To achieve high throughput and low latency, Punt Table supports multiple content storage engines and index storage engines behind two interfaces, Punt Store and Punt Index, which together form its storage layer. Both the object content and its indices can use whichever storage engine best suits the current data set. We evaluated the performance of different combinations of object store and index store on test data sets whose single-record sizes were chosen to simulate real application scenarios. The results show that a proper storage-layer configuration for a particular data set can dramatically improve throughput and reduce latency.
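
The two-interface storage layer described above can be illustrated with a minimal Python sketch. Only the names Punt Table, Punt Store, and Punt Index come from the text; every method signature, the in-memory engines `DictStore` and `HashIndex`, and the `indexed_fields` parameter are assumptions made for illustration and do not reflect the actual implementation.

```python
from abc import ABC, abstractmethod
from collections import defaultdict
from typing import Any, Dict, List, Optional, Set


class PuntStore(ABC):
    """Content storage interface (hypothetical signatures)."""

    @abstractmethod
    def put(self, key: str, obj: Dict[str, Any]) -> None: ...

    @abstractmethod
    def get(self, key: str) -> Optional[Dict[str, Any]]: ...


class PuntIndex(ABC):
    """Index storage interface (hypothetical signatures)."""

    @abstractmethod
    def add(self, field: str, value: Any, key: str) -> None: ...

    @abstractmethod
    def remove(self, field: str, value: Any, key: str) -> None: ...

    @abstractmethod
    def lookup(self, field: str, value: Any) -> Set[str]: ...


class DictStore(PuntStore):
    """Illustrative in-memory content engine, not one from the paper."""

    def __init__(self) -> None:
        self._data: Dict[str, Dict[str, Any]] = {}

    def put(self, key: str, obj: Dict[str, Any]) -> None:
        self._data[key] = dict(obj)  # copy so callers cannot mutate stored state

    def get(self, key: str) -> Optional[Dict[str, Any]]:
        return self._data.get(key)


class HashIndex(PuntIndex):
    """Illustrative in-memory index engine; indexed values must be hashable."""

    def __init__(self) -> None:
        self._postings: Dict[Any, Set[str]] = defaultdict(set)

    def add(self, field: str, value: Any, key: str) -> None:
        self._postings[(field, value)].add(key)

    def remove(self, field: str, value: Any, key: str) -> None:
        self._postings[(field, value)].discard(key)

    def lookup(self, field: str, value: Any) -> Set[str]:
        return set(self._postings.get((field, value), ()))


class PuntTable:
    """Schema-free table that delegates to a pluggable store and index."""

    def __init__(self, store: PuntStore, index: PuntIndex,
                 indexed_fields: List[str]) -> None:
        self.store = store
        self.index = index
        self.indexed_fields = indexed_fields

    def put(self, key: str, obj: Dict[str, Any]) -> None:
        # Drop index entries of any previous version before overwriting it.
        old = self.store.get(key)
        if old is not None:
            for field in self.indexed_fields:
                if field in old:
                    self.index.remove(field, old[field], key)
        self.store.put(key, obj)
        for field in self.indexed_fields:
            if field in obj:  # objects need not share the same fields
                self.index.add(field, obj[field], key)

    def query(self, field: str, value: Any) -> List[Dict[str, Any]]:
        # Resolve matching keys through the index, then fetch the contents.
        keys = sorted(self.index.lookup(field, value))
        return [self.store.get(k) for k in keys]
```

Under these assumptions, choosing a storage strategy for a data set amounts to passing a different `PuntStore` or `PuntIndex` implementation to `PuntTable`, for example `PuntTable(DictStore(), HashIndex(), ["owner"])`, without changing the code that calls `put` and `query`.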
