Measurement and Analysis of the Swarm Social Network With Tens of Millions of Nodes

Social graphs are widely used to represent the relationships among users in online social networks (OSNs). Because crawling an entire OSN is resource- and time-consuming, most existing work studies only a sampled subgraph. However, sampling may introduce serious inaccuracy into the analytic results, and some important metrics cannot be calculated from a sample at all. In this paper, we use a distributed approach to crawl the entire social network of Swarm, a leading mobile social app with more than 60 million users. Based on the massive crawled user data, we conduct a data-driven study to obtain a comprehensive picture of the whole Swarm social network. We provide a deep analysis of social interactions between Swarm users and reveal the relationship between social connectivity and check-in activities.
