Crawling Facebook for social network analysis purposes

We describe our work on collecting and analyzing massive data sets that describe the connections between participants in online social networks. We define alternative approaches to social network data collection and evaluate them in practice against the popular Facebook Web site. Using our ad hoc, privacy-compliant crawlers, we collected two large samples comprising millions of connections; the data is anonymous and organized as an undirected graph. We also describe a set of tools that we developed to analyze specific properties of such social network graphs, including degree distribution, centrality measures, scaling laws, and the distribution of friendship.
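
As an illustration of the kind of property mentioned above, the following minimal sketch computes the degree distribution of an undirected graph stored as an edge list. It is not the authors' tooling: the function name `degree_distribution`, the integer node identifiers, and the sample edges are all assumptions made for this example.

```python
from collections import Counter


def degree_distribution(edges):
    """Map each degree to the number of nodes that have it.

    `edges` is an iterable of (u, v) pairs describing an undirected
    graph; duplicate edges and self-loops are ignored.
    """
    neighbors = {}
    for u, v in edges:
        if u == v:
            continue  # a self-loop is not a friendship link
        # Store each link in both directions, since the graph is undirected.
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    return Counter(len(adjacent) for adjacent in neighbors.values())


# Hypothetical anonymized sample: node IDs are arbitrary integers.
edges = [(1, 2), (2, 1), (1, 3), (1, 4), (3, 4)]
dist = degree_distribution(edges)
for degree in sorted(dist):
    print(degree, dist[degree])
```

Storing neighbors as sets makes the count insensitive to an edge being reported twice, which can happen when the same friendship is seen from both endpoints during a crawl.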
