Towards Network-level Efficiency for Cloud Storage Services

Cloud storage services such as Dropbox, Google Drive, and Microsoft OneDrive provide users with a convenient and reliable way to store and share data from anywhere, on any device, and at any time. The cornerstone of these services is the data synchronization (sync) operation, which automatically maps changes in users' local filesystems to the cloud in a timely manner via a series of network communications. If not designed properly, however, the tremendous amount of data sync traffic can cause (financial) pain to both service providers and users. This paper addresses a simple yet critical question: Is the current data sync traffic of cloud storage services efficiently used? We first define a novel metric named TUE to quantify the Traffic Usage Efficiency of data synchronization. Based on both real-world traces and comprehensive experiments, we study and characterize the TUE of six widely used cloud storage services. Our results demonstrate that a considerable portion of the data sync traffic is, in a sense, wasteful, and can be avoided or significantly reduced via carefully designed data sync mechanisms. Overall, our study of the TUE of cloud storage services not only provides guidance for service providers to develop more efficient, traffic-economic services, but also helps users pick services that best fit their needs and budgets.
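The abstract names the TUE metric but does not give its formula. As a minimal sketch, assume TUE is the ratio of the total data sync traffic to the size of the data update that triggered it, so that a value near 1 means little overhead and larger values mean more traffic per byte actually changed. The function name, parameter names, and example numbers below are hypothetical and only illustrate that assumed reading.

```python
def traffic_usage_efficiency(sync_traffic_bytes: int, update_size_bytes: int) -> float:
    """Assumed reading of TUE: total sync traffic divided by data update size.

    Both arguments are byte counts. This formula is an assumption made for
    illustration; the abstract itself does not state it.
    """
    if sync_traffic_bytes < 0:
        raise ValueError("sync_traffic_bytes must be non-negative")
    if update_size_bytes <= 0:
        raise ValueError("update_size_bytes must be positive")
    return sync_traffic_bytes / update_size_bytes


# Hypothetical example: a 1,000-byte edit that causes 1,500 bytes of sync traffic.
example_tue = traffic_usage_efficiency(1500, 1000)
```

Under this reading, comparing services amounts to measuring the traffic each one generates for the same update and comparing the resulting ratios.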
