CDStore: Toward Reliable, Secure, and Cost-Efficient Cloud Storage via Convergent Dispersal

CDStore is a unified multicloud storage solution that lets users outsource backup data with reliability, security, and cost-efficiency guarantees. CDStore builds on an augmented secret-sharing scheme called convergent dispersal, which supports deduplication by using deterministic, content-derived hashes as inputs to secret sharing. CDStore's design is presented here, with an emphasis on how it combines convergent dispersal with two-stage deduplication to achieve both bandwidth and storage savings while remaining robust against side-channel attacks launched by malicious users on the client side. A cost analysis shows that CDStore yields significant savings over baseline cloud storage solutions.
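
To make the idea of content-derived inputs to secret sharing concrete, the following is a minimal toy sketch, not CDStore's actual construction. It assumes a simple n-of-n XOR sharing in which the values that would normally be random are instead derived from a SHA-256 hash of the data, so identical data always produces identical shares and each cloud can deduplicate them. The function names (`convergent_split`, `combine`) and the keystream derivation are hypothetical choices made for illustration only.

```python
import hashlib
from typing import List


def _keystream(seed: bytes, index: int, length: int) -> bytes:
    """Expand a seed into `length` pseudo-random bytes (illustrative only)."""
    out = b""
    counter = 0
    while len(out) < length:
        block = seed + index.to_bytes(4, "big") + counter.to_bytes(4, "big")
        out += hashlib.sha256(block).digest()
        counter += 1
    return out[:length]


def _xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def convergent_split(secret: bytes, n: int) -> List[bytes]:
    """Split `secret` into n shares; all n are needed to recover it.

    Assumption: the n-1 masking shares are derived from a hash of the
    secret rather than from fresh randomness, so the output is
    deterministic for a given input.
    """
    if n < 2:
        raise ValueError("need at least two shares")
    seed = hashlib.sha256(secret).digest()  # content-derived seed
    shares = [_keystream(seed, i, len(secret)) for i in range(n - 1)]
    last = secret
    for share in shares:
        last = _xor(last, share)
    shares.append(last)  # XOR of all shares equals the secret
    return shares


def combine(shares: List[bytes]) -> bytes:
    """Recover the secret by XOR-ing all shares together."""
    out = shares[0]
    for share in shares[1:]:
        out = _xor(out, share)
    return out


if __name__ == "__main__":
    chunk = b"backup chunk"
    first = convergent_split(chunk, 3)
    second = convergent_split(chunk, 3)
    print(first == second)           # same content, same shares
    print(combine(first) == chunk)   # all shares recover the content
```

Because the shares depend only on the content, two users who back up the same chunk send byte-identical shares to each cloud, which is what makes deduplication possible. This toy scheme tolerates no share loss; a practical design would pair the same idea with a threshold or erasure-coded scheme.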
