Dissecting UbuntuOne: Autopsy of a Global-scale Personal Cloud Back-end

Personal Cloud services, such as Dropbox or Box, have been widely adopted by users. Unfortunately, very little is known about the internal operation and general characteristics of Personal Clouds, since they are proprietary services. In this paper, we focus on understanding the nature of Personal Clouds by presenting the internal structure and a measurement study of UbuntuOne (U1). First, we detail the U1 architecture, the core components involved in the U1 metadata service hosted in Canonical's datacenter, and the interactions of U1 with Amazon S3 to outsource data storage. To our knowledge, this is the first research work to describe the internals of a large-scale Personal Cloud. Second, by tracing the U1 servers, we provide an extensive analysis of its back-end activity over one month. Our analysis covers the storage workload, the user behavior, and the performance of the U1 metadata store. Moreover, based on our analysis, we suggest improvements to U1 that can also benefit similar Personal Cloud systems. Finally, we contribute our dataset to the community; it is the first to contain the back-end activity of a large-scale Personal Cloud. We believe that our dataset provides unique opportunities for extending research in the field.
