Gateway finder in large graphs: problem definitions and fast solutions

Given a graph, how can we find a small group of ‘gateways’, that is, a small subset of nodes that are crucial in connecting a source to a target? For instance, given a social network, who is the best person to introduce you to, say, Chris Ferguson, the poker champion? Or, given a network of people and skills, who is the best person to help you learn about, say, wavelets? We formally formulate this problem in two scenarios: Pair-Gateway and Group-Gateway. For each scenario, we show that the objective is submodular, so the problem can be solved near-optimally. We further give fast, scalable algorithms to find such gateways. Extensive experimental evaluations on real data sets demonstrate the effectiveness and efficiency of the proposed methods.
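
To illustrate why submodularity matters here, the following is a minimal sketch of greedy selection on a monotone submodular objective, for which the classical greedy procedure is known to achieve a (1 - 1/e) approximation. The objective is an assumption made for illustration only: it counts how many given source-to-target paths pass through at least one chosen node (a coverage function). It is not the gateway score defined in the paper, and the names `coverage` and `greedy_gateways` are hypothetical.

```python
from typing import List, Set


def coverage(paths: List[Set[str]], chosen: Set[str]) -> int:
    """Illustrative objective (an assumption, not the paper's score):
    the number of paths that contain at least one chosen node."""
    return sum(1 for p in paths if p & chosen)


def greedy_gateways(paths: List[Set[str]], k: int) -> List[str]:
    """Pick up to k nodes, each time adding the node with the largest
    marginal gain in coverage. Ties go to the alphabetically first node."""
    candidates: Set[str] = set().union(*paths) if paths else set()
    chosen: Set[str] = set()
    picked: List[str] = []
    for _ in range(k):
        base = coverage(paths, chosen)
        best, best_gain = None, 0
        for v in sorted(candidates - chosen):
            gain = coverage(paths, chosen | {v}) - base
            if gain > best_gain:
                best, best_gain = v, gain
        if best is None:  # no remaining node improves the objective
            break
        chosen.add(best)
        picked.append(best)
    return picked


# Each set holds the intermediate nodes of one hypothetical source-target path.
example_paths = [{"a", "b"}, {"a", "c"}, {"d"}, {"b", "d"}]
selected = greedy_gateways(example_paths, 2)
```

In this toy input, `selected` is `["a", "d"]`, which together touch all four paths. The sketch recomputes the objective from scratch for every candidate, so it is meant to show the selection rule rather than the fast, scalable algorithms the abstract refers to.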
