Towards Scalable and Reliable In-Memory Storage System: A Case Study with Redis

In recent years, in-memory key-value storage systems have become increasingly popular for real-time and interactive tasks. Compared with disks, memory offers much higher throughput and lower latency, which lets these systems serve data requests with much higher performance. However, since memory has much smaller capacity than disk, expanding the capacity of an in-memory storage system while maintaining its high performance becomes a crucial problem. At the same time, since data in memory is non-persistent, it may be lost when the system goes down. In this paper, we present a case study of Redis, a popular in-memory key-value storage system. We find that although the latest release of Redis supports clustering, so that data can be spread across distributed nodes to provide a larger storage capacity, its performance is limited by its decentralized design, under which clients usually need two connections to get a request served. To make the system more scalable, we propose a Client-side Key-to-Node Caching method that helps direct each request to the right service node. Experimental results show that this technique improves the system's performance by nearly 2 times. We also find that although Redis supports data replication on slave nodes to ensure data safety, it can still lose part of the data because of weak consistency between master and slave nodes: its defective ordering of data replication and request reply may lead to data loss without the client being notified. To make it more reliable, we propose a Master-slave Semi Synchronization method, which uses TCP to enforce the order of data replication and request reply so that when a client receives an "OK" message, the corresponding data must already have been replicated. Data reliability improves significantly, while the performance overhead stays within 5%.
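The idea behind client-side key-to-node caching can be illustrated with a small simulation. The sketch below is not the paper's implementation: `ToyCluster`, `ToyNode`, and `CachingClient` are hypothetical names, no real Redis server is contacted, and the slot function only approximates Redis Cluster's mapping (CRC16 modulo 16384, with hash tags ignored). It shows that a key's first request may cost two round trips (a wrong node plus a redirect), while later requests for the same key go straight to the owning node.

```python
import binascii

NUM_SLOTS = 16384  # assumed slot count, modeled on Redis Cluster


def key_slot(key: str) -> int:
    """Map a key to a hash slot (approximation; hash tags are ignored)."""
    return binascii.crc_hqx(key.encode("utf-8"), 0) % NUM_SLOTS


class ToyNode:
    """Hypothetical stand-in for one cluster node owning a slot range."""

    def __init__(self, name: str, slots: range):
        self.name = name
        self.slots = slots


class ToyCluster:
    """Hypothetical cluster that splits the slot space evenly across nodes."""

    def __init__(self, node_count: int):
        per_node = NUM_SLOTS // node_count
        self.nodes = []
        for i in range(node_count):
            start = i * per_node
            end = NUM_SLOTS if i == node_count - 1 else (i + 1) * per_node
            self.nodes.append(ToyNode(f"node-{i}", range(start, end)))

    def owner_of(self, key: str) -> ToyNode:
        slot = key_slot(key)
        return next(n for n in self.nodes if slot in n.slots)

    def request(self, node: ToyNode, key: str):
        """Return (served, owner); served is False when `node` must redirect."""
        owner = self.owner_of(key)
        return node is owner, owner


class CachingClient:
    """Client that remembers which node served each key."""

    def __init__(self, cluster: ToyCluster):
        self.cluster = cluster
        self.key_to_node = {}  # the client-side key-to-node cache
        self.round_trips = 0

    def access(self, key: str) -> ToyNode:
        # Use the cached node if known, otherwise guess the first node.
        node = self.key_to_node.get(key, self.cluster.nodes[0])
        self.round_trips += 1
        served, owner = self.cluster.request(node, key)
        if not served:
            # Wrong node: follow the redirect with a second round trip.
            self.round_trips += 1
            self.cluster.request(owner, key)
        self.key_to_node[key] = owner
        return owner


if __name__ == "__main__":
    client = CachingClient(ToyCluster(node_count=3))
    keys = [f"user:{i}" for i in range(1000)]
    for k in keys:
        client.access(k)
    cold = client.round_trips
    client.round_trips = 0
    for k in keys:
        client.access(k)
    print("round trips, cold cache:", cold)
    print("round trips, warm cache:", client.round_trips)
```

With a warm cache every request takes exactly one round trip in this model. A real client would also need to invalidate cached entries when slots migrate between nodes, which the sketch leaves out.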
