Enumeration System on HBase for Low-Latency
HBase is a popular distributed key/value storage system based on the ideas of BigTable. It is used in many data centers, such as those of Facebook and Twitter, for its portability and scalability. Industrial deployments expect both low latency and large storage capacity from such a system. However, retrieving one column by the value of another is time-consuming. Many techniques have been considered to solve this problem. One approach is to add a secondary index to HBase, such as h index, which achieves high retrieval performance. Unfortunately, when a column holds only a limited set of distinct values, a secondary index accelerates retrieval but cannot reduce storage consumption. In this paper, we present a novel design for HBase that reduces storage consumption while also accelerating retrieval in this situation. We design an enumeration system for HBase and provide an interface for creating an enumeration on a specific column of a table. Our performance evaluation shows a 2.27x improvement in retrieval and a 12x reduction in storage compared with HBase without enumeration.
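To illustrate the general idea of enumerating a low-cardinality column, the following is a minimal in-memory sketch in Python. It is not the paper's implementation and does not use the HBase API: the class `EnumeratedColumn` and its methods `put`, `get`, and `rows_with` are hypothetical names chosen for illustration. The sketch assumes that each distinct value is assigned a small integer code, that cells store the code instead of the full value, and that a reverse map from code to row keys answers lookups by value.

```python
from collections import defaultdict


class EnumeratedColumn:
    """Toy model of an enumerated column (hypothetical, not the HBase API)."""

    def __init__(self):
        self.code_of = {}    # distinct value -> small integer code
        self.value_of = []   # code -> distinct value (stored once)
        self.cell_codes = {}  # row key -> code stored in place of the value
        self.rows_by_code = defaultdict(set)  # code -> row keys holding it

    def put(self, row_key, value):
        # If the row is overwritten, remove its old reverse-lookup entry.
        old = self.cell_codes.get(row_key)
        if old is not None:
            self.rows_by_code[old].discard(row_key)
        # Assign the next free code the first time a value is seen.
        code = self.code_of.get(value)
        if code is None:
            code = len(self.value_of)
            self.code_of[value] = code
            self.value_of.append(value)
        self.cell_codes[row_key] = code
        self.rows_by_code[code].add(row_key)

    def get(self, row_key):
        # Decode the stored code back to the original value.
        return self.value_of[self.cell_codes[row_key]]

    def rows_with(self, value):
        # Look up rows by column value without scanning every cell.
        code = self.code_of.get(value)
        if code is None:
            return []
        return sorted(self.rows_by_code[code])


col = EnumeratedColumn()
col.put("u1", "CN")
col.put("u2", "US")
col.put("u3", "CN")
print(col.get("u3"))
print(col.rows_with("CN"))
```

In this sketch each distinct value is stored once in `value_of`, so the per-cell cost is a small integer rather than the full value, and `rows_with` avoids a full scan. How the actual system persists the enumeration and integrates with HBase regions is not described in the abstract and is not modeled here.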