Unbiased Sampling of Bipartite Graph

The growing size of online social networks (OSNs) has motivated studies of sampling methods that produce a relatively small but representative sample of a large-scale OSN, so that the cost of measurement and analysis stays affordable. A number of sampling methods that crawl social graphs already exist, but most are suited to one-mode graphs, which contain only one type of node. The literature shows that Metropolis-Hastings Random Walk (MHRW) produces unbiased samples and performs better than other sampling methods. However, a growing number of online social networking sites, such as Taobao and eBay, have two types of nodes. Representing these two-mode networks as bipartite graphs, we study sampling methods for bipartite graphs in this paper. Our contributions are an analysis of the effectiveness of extending the MHRW algorithm to bipartite graphs and a modification of the sampling procedure that improves its stability. Finally, we compare our MHRW sampling algorithm with Random Walk (RW) on generated bipartite graphs as well as on graphs of real two-mode networks. Simulations show that MHRW outperforms RW on bipartite graphs.
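
To make the MHRW idea concrete, the following is a minimal Python sketch of the textbook Metropolis-Hastings Random Walk on a small bipartite graph. It is not the paper's algorithm and does not include the stability modification mentioned above, whose details are not given here. The function name `mhrw_sample`, the adjacency-dictionary representation, and the toy user/item graph are all assumptions made for illustration. At each step the walk proposes a uniformly chosen neighbor and accepts it with probability min(1, deg(current)/deg(candidate)), which is the standard way to target a uniform distribution over nodes.

```python
import random


def mhrw_sample(adj, start, steps, rng=None):
    """Textbook MHRW sketch (hypothetical helper, not the paper's code).

    adj   : dict mapping each node to a list of its neighbors
    start : node at which the walk begins
    steps : number of walk steps; one sample is recorded per step
    """
    rng = rng or random.Random()
    current = start
    samples = []
    for _ in range(steps):
        # Propose a neighbor chosen uniformly at random.
        candidate = rng.choice(adj[current])
        # Accept with probability min(1, deg(current) / deg(candidate)).
        # A rejected proposal keeps the walk at the current node, which
        # is then recorded again as a sample.
        if rng.random() < len(adj[current]) / len(adj[candidate]):
            current = candidate
        samples.append(current)
    return samples


# Toy bipartite graph (assumed for illustration): users u1..u3, items i1..i2.
adj = {
    "u1": ["i1"],
    "u2": ["i1", "i2"],
    "u3": ["i1", "i2"],
    "i1": ["u1", "u2", "u3"],
    "i2": ["u2", "u3"],
}

samples = mhrw_sample(adj, start="u1", steps=200_000, rng=random.Random(0))
```

In a plain random walk every proposal is accepted, so nodes are visited in proportion to their degree; the acceptance test above is what removes that degree bias. On this toy graph the rejected moves also act as self-loops, so the walk does not simply alternate between the two node types.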
