Scaling social media applications into geo-distributed clouds

Federation of geo-distributed cloud services is a trend in cloud computing that, by spanning multiple data centers at different geographical locations, can provide a cloud platform with much larger capacity. Such a geo-distributed cloud is well suited to supporting large-scale social media applications with dynamic content and demand. Although promising, its realization raises two challenges: how to efficiently store and migrate content among different cloud sites, and how to distribute user requests to the appropriate sites for timely responses at modest cost. These challenges escalate when we consider the persistently growing content and volatile user behavior in a social media application. By exploiting social influences among users, this paper proposes efficient proactive algorithms for dynamic, optimal scaling of a social media application in a geo-distributed cloud. Our key contribution is an online content migration and request distribution algorithm with the following features: 1) future demand prediction that characterizes social influences among users in a simple but effective epidemic model; 2) one-shot optimal content migration and request distribution, based on efficient optimization algorithms, to address the predicted demand; and 3) a Δ(t)-step look-ahead mechanism that adjusts the one-shot optimization results toward the offline optimum. We verify the effectiveness of our online algorithm through theoretical analysis and through thorough comparisons with existing algorithms, including the ideal offline optimum, in large-scale experiments with dynamic, realistic settings on Amazon Elastic Compute Cloud (EC2).
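The abstract names an epidemic model for demand prediction but does not give its equations. As a minimal sketch only, assuming a textbook discrete-time SIR-style update rather than the paper's actual formulation, the idea of forecasting new viewers of a piece of content from its current viewers might look like the following. The function name `predict_new_viewers` and the parameters `beta` (influence rate) and `gamma` (rate at which viewers lose interest) are hypothetical and chosen for illustration.

```python
def predict_new_viewers(viewers, susceptible, beta, gamma, steps):
    """Forecast new viewers per time step with an SIR-style update.

    Assumption: this is a generic epidemic model, not the paper's own.
    viewers     -- users currently viewing/sharing the content ("infected")
    susceptible -- users who have not yet seen it
    beta        -- hypothetical social-influence (infection) rate per step
    gamma       -- hypothetical rate at which viewers lose interest
    """
    population = viewers + susceptible  # held fixed as the model's N
    forecast = []
    for _ in range(steps):
        if population > 0:
            new_viewers = beta * viewers * susceptible / population
        else:
            new_viewers = 0.0
        # Never convert more users than remain susceptible.
        new_viewers = min(new_viewers, susceptible)
        lost = gamma * viewers
        susceptible -= new_viewers
        viewers += new_viewers - lost
        forecast.append(new_viewers)
    return forecast


forecast = predict_new_viewers(viewers=10, susceptible=990,
                               beta=0.5, gamma=0.1, steps=3)
```

A forecast of this kind could then serve as the predicted demand that a one-shot migration and request distribution step is optimized against; how the paper couples the two is not stated in the abstract.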

[25]  Bo Li,et al.  Scaling Social Media Applications Into Geo-Distributed Clouds , 2015, IEEE/ACM Transactions on Networking.

[26]  Bo Li,et al.  A QoS-Based Joint Scheduling and Caching Algorithm for Multimedia Objects , 2004, World Wide Web.

[27]  Allan Borodin,et al.  Online computation and competitive analysis , 1998 .

[28]  David A. Maltz,et al.  Cloudward bound: planning for beneficial migration of enterprise applications to the cloud , 2010, SIGCOMM '10.

[29]  Lixin Gao,et al.  The impact of YouTube recommendation system on video views , 2010, IMC '10.

[30]  Yuval Rabani,et al.  Competitive algorithms for distributed data management (extended abstract) , 1992, STOC '92.

[31]  Chris Rose,et al.  A Break in the Clouds: Towards a Cloud Definition , 2011 .

[32]  J. Brian Gray,et al.  Introduction to Linear Regression Analysis , 2002, Technometrics.

[33]  Lifeng Sun,et al.  Propagation-based social-aware replication for social video contents , 2012, ACM Multimedia.

[34]  Lifeng Sun,et al.  Guiding internet-scale video service deployment using microblog-based prediction , 2012, 2012 Proceedings IEEE INFOCOM.

[35]  Roy M. Anderson,et al.  The Population Dynamics of Infectious Diseases: Theory and Applications , 1982, Population and Community Biology.

[36]  Cheng Huang,et al.  Challenges, design and analysis of a large-scale p2p-vod system , 2008, SIGCOMM '08.

[37]  Muli Ben-Yehuda,et al.  The Reservoir model and architecture for open federated cloud computing , 2009, IBM J. Res. Dev..

[38]  Yuval Rabani,et al.  Competitive Algorithms for Distributed Data Management , 1995, J. Comput. Syst. Sci..

[39]  Jiangchuan Liu,et al.  Load-balanced migration of social media to content clouds , 2011, NOSSDAV.