Improving Cache Performance for Large-Scale Photo Stores via Heuristic Prefetching Scheme

Photo service providers face the critical challenge of managing huge volumes of stored photos, typically on the order of billions, while ensuring a satisfactory user experience nationwide or worldwide. Distributed photo caching architectures are widely deployed to meet high performance expectations, and in them efficient yet still poorly understood caching policies play an essential role. In this work, we present a comprehensive study of internet-scale photo caching algorithms in the case of QQPhoto from Tencent Inc., the largest social network service company in China. We show that even advanced cache algorithms perform only at a level similar to simple baseline algorithms, and that a large performance gap remains between these cache algorithms and the theoretically optimal algorithm, owing to the complicated access behaviors in such a large multi-tenant environment. We then explain the reasons behind this phenomenon by extensively investigating the characteristics of QQPhoto workloads. Finally, to realistically improve QQPhoto cache efficiency further, we propose incorporating a prefetcher into the cache stack, based on the observed immediacy feature that is unique to the QQPhoto workload. The prefetcher proactively loads selected photos into the cache before they are requested for the first time, eliminating compulsory misses and raising hit ratios. Our extensive evaluation shows that, with appropriate prefetching, we improve the cache hit ratio by up to 7.4 percent and reduce the average access latency by 6.9 percent, at a marginal cost of 4.14 percent additional backend network traffic compared to the original system, which performs no prefetching.
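To make the prefetching idea concrete, the following is a minimal sketch and not the QQPhoto implementation. It assumes that "immediacy" means a photo tends to be requested shortly after it is uploaded, which the abstract does not spell out. The LRU cache, the `hit_ratio` helper, the event names, and the toy trace are all hypothetical and chosen only for illustration. The sketch compares an LRU cache with and without admitting photos at upload time.

```python
from collections import OrderedDict


class LRUCache:
    """A plain LRU cache that stores keys only (illustrative)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def admit(self, key):
        """Insert key as most recently used, evicting the oldest if full."""
        self.items[key] = True
        self.items.move_to_end(key)
        while len(self.items) > self.capacity:
            self.items.popitem(last=False)

    def access(self, key):
        """Return True on a hit; on a miss, admit the key and return False."""
        if key in self.items:
            self.items.move_to_end(key)
            return True
        self.admit(key)
        return False


def hit_ratio(trace, capacity, prefetch):
    """Replay (event, photo) pairs and return hits / requests.

    Assumption: with prefetch=True, every uploaded photo is admitted
    immediately. A real prefetcher would select only some photos.
    """
    cache = LRUCache(capacity)
    hits = 0
    requests = 0
    for event, photo in trace:
        if event == "upload":
            if prefetch:
                cache.admit(photo)
        else:
            requests += 1
            hits += cache.access(photo)
    return hits / requests


# Hypothetical toy trace: "a" and "b" are requested right after upload,
# while "c" is an older photo uploaded before the trace begins.
TOY_TRACE = [
    ("upload", "a"), ("request", "a"), ("request", "a"),
    ("upload", "b"), ("request", "b"),
    ("request", "c"),
    ("request", "a"),
]

if __name__ == "__main__":
    print("no prefetch:", hit_ratio(TOY_TRACE, 2, prefetch=False))
    print("prefetch:   ", hit_ratio(TOY_TRACE, 2, prefetch=True))
```

In this toy trace the first requests for "a" and "b" are compulsory misses without prefetching and become hits with it, while the request for "c" misses either way. Admitting every upload also costs backend traffic for photos that may never be requested, which is presumably why the abstract speaks of prefetching selected photos.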
