Optimizing Inter-server Communication for Online Social Networks

Distributed storage systems are key infrastructure for hosting the user data of large-scale Online Social Networks (OSNs). The amount of inter-server communication is an important scalability indicator for these systems. Data partitioning and replication are two interrelated issues that affect the inter-server traffic caused by user-initiated read and write operations. This paper investigates the problem of minimizing the total inter-server traffic among a cluster of OSN servers through joint optimization of partitioning and replication. We propose a Traffic-Optimized Partitioning and Replication (TOPR) method based on an analysis of how replica allocation affects inter-server communication. Lightweight algorithms are developed to adjust partitioning and replication dynamically according to data read and write rates. Evaluations with real Facebook and Twitter social graphs show that TOPR significantly reduces inter-server communication compared with state-of-the-art methods.
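The trade-off between read and write traffic can be illustrated with a minimal sketch. It is not the TOPR algorithm; it assumes a simplified cost model in which a read is remote unless the reader's master server holds a copy of the data, and every write is propagated once to each replica on another server. The function name, parameters, and sample rates are hypothetical.

```python
from typing import Dict, Set, Tuple


def inter_server_traffic(
    master: Dict[str, int],
    replicas: Dict[str, Set[int]],
    read_rate: Dict[Tuple[str, str], float],
    write_rate: Dict[str, float],
) -> float:
    """Total inter-server traffic under an assumed, simplified cost model."""
    traffic = 0.0
    # Reads: (reader, owner) is served locally only if the reader's master
    # server stores the owner's data, either as master copy or as a replica.
    for (reader, owner), rate in read_rate.items():
        home = master[reader]
        has_local_copy = master[owner] == home or home in replicas.get(owner, set())
        if not has_local_copy:
            traffic += rate
    # Writes: each update is pushed to every replica on a server other than
    # the owner's master server.
    for user, rate in write_rate.items():
        remote_replicas = replicas.get(user, set()) - {master[user]}
        traffic += rate * len(remote_replicas)
    return traffic


# Hypothetical two-user, two-server example.
master = {"alice": 0, "bob": 1}
read_rate = {("alice", "bob"): 5.0, ("bob", "alice"): 1.0}
write_rate = {"alice": 2.0, "bob": 1.0}

no_replicas = inter_server_traffic(master, {}, read_rate, write_rate)
replicate_bob = inter_server_traffic(master, {"bob": {0}}, read_rate, write_rate)
replicate_alice = inter_server_traffic(master, {"alice": {1}}, read_rate, write_rate)
print(no_replicas, replicate_bob, replicate_alice)
```

In this toy setting, replicating bob's data next to alice lowers the total because the reads it saves outweigh the extra write propagation, while replicating alice's data raises the total. This is the sense in which replica allocation has to follow read and write rates.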
