SEER-MCache: A Prefetchable Memory Object Caching System for IoT Real-Time Data Processing

Memory object caching systems, such as Memcached and Redis, have proven to be simple and highly efficient middleware for improving the performance of Internet of Things (IoT) devices that query databases in the cloud. However, their performance guarantee rests on the assumption that the target data queried by an IoT device will be accessed many times and will hit in the caching system. Therefore, when the database system handles non-repeated IoT queries, it usually delivers suboptimal performance, which greatly impairs the efficiency of real-time data processing on IoT devices. To address this issue, we propose Seer-MCache, a memory object caching system with a smart prefetching (read-ahead) function that fills the caching system with the desired data before intensive IoT queries arrive. Seer-MCache includes a set of rules that trigger specific read-ahead behaviors. These rules can be customized according to workload characteristics and system load. We implement a prototype system using Redis (caching layer) and MySQL server (database system). Extensive experiments are conducted to verify the effectiveness of Seer-MCache. The results show that Seer-MCache can improve the performance of read-intensive workloads by up to 61% (39.5% on average). Meanwhile, the cost of the read-ahead behavior is moderate and controllable.
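The idea of rule-driven read-ahead can be illustrated with a minimal sketch. This is not the Seer-MCache implementation: the abstract does not specify the rule format, so the `ReadAheadRule` fields (`matches`, `window`, `max_load`), the `PrefetchingCache` class, the integer key scheme, and the `load` parameter are all hypothetical, and plain dictionaries stand in for Redis and MySQL. The sketch only shows how a cache miss might trigger prefetching of neighboring keys, and how a load threshold could suppress it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class ReadAheadRule:
    """Hypothetical rule: if `matches(key)` holds and the current load is
    at most `max_load`, prefetch the next `window` keys after a miss."""
    matches: Callable[[int], bool]
    window: int
    max_load: float


class PrefetchingCache:
    """Cache-aside layer with rule-driven read-ahead (illustrative only)."""

    def __init__(self, database: Dict[int, str], rules: List[ReadAheadRule]):
        self.database = database          # stand-in for the database (e.g. MySQL)
        self.cache: Dict[int, str] = {}   # stand-in for the caching layer (e.g. Redis)
        self.rules = rules
        self.hits = 0
        self.misses = 0

    def get(self, key: int, load: float = 0.0) -> Optional[str]:
        # Serve from the cache when possible.
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        # On a miss, fetch from the database and consider reading ahead.
        self.misses += 1
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value
            self._read_ahead(key, load)
        return value

    def _read_ahead(self, key: int, load: float) -> None:
        # Apply the first rule that matches and is allowed at this load.
        for rule in self.rules:
            if rule.matches(key) and load <= rule.max_load:
                for k in range(key + 1, key + 1 + rule.window):
                    if k in self.database and k not in self.cache:
                        self.cache[k] = self.database[k]
                break


# Assumed workload: a device scans sequential sensor readings exactly once.
database = {i: f"reading-{i}" for i in range(10)}
rules = [ReadAheadRule(matches=lambda k: True, window=3, max_load=0.8)]

cache = PrefetchingCache(database, rules)
for key in range(8):
    cache.get(key)

# Same rule, but the reported load exceeds max_load, so nothing is prefetched.
busy_cache = PrefetchingCache(database, rules)
for key in range(8):
    busy_cache.get(key, load=0.9)

print(cache.hits, cache.misses)            # read-ahead enabled
print(busy_cache.hits, busy_cache.misses)  # read-ahead suppressed by load
```

In this toy scan every key is requested only once, so a plain cache never hits, while the read-ahead variant turns most requests into hits at the cost of extra database reads on each miss. The `max_load` threshold is one assumed way of keeping that cost controllable.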
