Exploring the Power of Frequent Neighborhood Patterns on Edge Weight Estimation

Links in social networks conflate many kinds of relationships, such as acquaintances and friends, which raises the problem of link strength prediction: given a social tie e = (u, v) in a social network, how strong is e? Previous work tackles this problem mainly with node profile-based methods, i.e., methods that rely on users' profile information. However, some networks do not have node profiles. In this thesis, we study the novel problem of exploring the power of frequent neighborhood patterns for edge weight estimation. Given a labeled graph, we estimate its edge weights using its structural information as features. We develop an efficient pattern-growth-based algorithm that mines frequent neighborhood patterns to serve as these features. Experimental results on two real datasets demonstrate the efficiency of our method and the effectiveness of the features based on frequent neighborhood patterns.
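To make the idea of structural features concrete, the following is a minimal Python sketch, not the thesis's pattern-growth algorithm. It assumes a deliberately simplified notion of a neighborhood pattern: a pair (endpoint label, neighbor label) observed around an edge. The support definition (number of edges around which a pattern occurs), the function names, and the toy graph are all hypothetical choices made for illustration only.

```python
from collections import Counter

def neighbor_label_patterns(adj, labels, u, v):
    """Count (endpoint label, neighbor label) pairs around edge (u, v).

    Simplifying assumption: a "pattern" is a single labeled hop away from
    an endpoint, excluding the edge (u, v) itself.
    """
    patterns = Counter()
    for end, other in ((u, v), (v, u)):
        for w in adj[end]:
            if w != other:
                patterns[(labels[end], labels[w])] += 1
    return patterns

def frequent_pattern_features(adj, labels, edges, min_support):
    """Keep patterns seen around at least min_support edges and build
    one count vector per edge over those frequent patterns."""
    per_edge = {e: neighbor_label_patterns(adj, labels, *e) for e in edges}
    support = Counter()
    for pats in per_edge.values():
        support.update(pats.keys())  # each edge contributes at most 1 per pattern
    frequent = sorted(p for p, s in support.items() if s >= min_support)
    features = {e: [pats.get(p, 0) for p in frequent]
                for e, pats in per_edge.items()}
    return frequent, features

# Hypothetical toy labeled graph (undirected adjacency sets).
labels = {1: "A", 2: "B", 3: "A", 4: "B"}
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
edges = [(1, 2), (1, 3), (2, 3), (3, 4)]

frequent, features = frequent_pattern_features(adj, labels, edges, min_support=3)
```

The resulting per-edge vectors could then be passed to any regression model to estimate edge weights; which model to use is left open here.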
