Maximising the Quality of Influence

In percolation theory, each vertex of a graph has a binary state, either active or inactive, and a percolation process determines how activation spreads within the graph. Firstly, we propose and analyse a simple data-driven percolation process in which percolations are first learnt from a graph with observed percolations. Secondly, we study a problem related to the one solved by Kempe et al. in [1]: given a percolation process, which k vertices should one choose in order to maximise the number of active vertices at the end of the process? This question is important in many areas, ranging from viral marketing to the study of epidemic spread. We generalise the problem in two ways: activations take values in [0, 1], measuring the “quality” of percolation, and percolation decays along edges in the percolation graph. Given a cost for activating each vertex, which may vary between vertices, we maximise the total activation whilst keeping within a budget L. The problem can be solved by a greedy algorithm with a guaranteed approximation quality, and we furthermore show its connection to the maximal coverage problem. The resulting algorithm is analysed empirically over predicted percolation graphs on a synthetic dataset and on a real dataset modelling information diffusion within a social network.
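The budgeted greedy selection described above can be illustrated with a minimal Python sketch. The abstract does not specify the exact model, so everything here is an assumption made for illustration: a seed vertex has activation 1, the activation it passes to another vertex is the largest product of edge decay factors along any path, a vertex reached from several seeds keeps the highest value, and the greedy rule picks the affordable vertex with the best marginal gain per unit cost. The names `best_path_quality` and `greedy_budgeted_seeds` are hypothetical, and the sketch makes no claim to reproduce the approximation guarantee of the paper's algorithm.

```python
import heapq


def best_path_quality(graph, source):
    """Largest product of edge decays from source to every reachable vertex.

    Assumes every decay lies in (0, 1], so a Dijkstra-style search is valid.
    """
    quality = {source: 1.0}
    heap = [(-1.0, source)]
    while heap:
        neg_q, u = heapq.heappop(heap)
        q = -neg_q
        if q < quality.get(u, 0.0):
            continue  # stale heap entry
        for v, decay in graph.get(u, {}).items():
            cand = q * decay
            if cand > quality.get(v, 0.0):
                quality[v] = cand
                heapq.heappush(heap, (-cand, v))
    return quality


def greedy_budgeted_seeds(graph, costs, budget):
    """Pick seeds by marginal activation gain per unit cost within the budget.

    Assumes all costs are strictly positive.
    """
    reach = {v: best_path_quality(graph, v) for v in costs}
    activation = {}  # current activation in [0, 1] for each reached vertex
    seeds, spent = [], 0.0
    candidates = set(costs)
    while candidates:
        best, best_ratio = None, 0.0
        for v in sorted(candidates):
            if spent + costs[v] > budget:
                continue  # not affordable with the remaining budget
            gain = sum(max(0.0, q - activation.get(u, 0.0))
                       for u, q in reach[v].items())
            ratio = gain / costs[v]
            if ratio > best_ratio:
                best, best_ratio = v, ratio
        if best is None:
            break  # nothing affordable improves the total activation
        for u, q in reach[best].items():
            activation[u] = max(activation.get(u, 0.0), q)
        seeds.append(best)
        spent += costs[best]
        candidates.remove(best)
    return seeds, sum(activation.values())


# Toy directed graph (hypothetical data): graph[u][v] is the decay on edge u -> v.
graph = {"a": {"b": 0.5, "c": 0.5}, "b": {"d": 0.8}, "c": {}, "d": {}}
costs = {"a": 2.0, "b": 1.0, "c": 1.0, "d": 1.0}
seeds, total = greedy_budgeted_seeds(graph, costs, budget=3.0)
```

Under these assumptions the total activation is a monotone submodular function of the seed set, which is the property that greedy guarantees of this kind rely on. A ratio rule used on its own can perform badly when one expensive vertex dominates, so budgeted variants usually also compare the result against the best single affordable vertex.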

[1] A. Rényi. On the evolution of random graphs, 2001.

[2] Alexander J. Smola, et al. Support Vector Regression Machines, 1996, NIPS.

[3] Carlos Guestrin, et al. A Note on the Budgeted Maximization of Submodular Functions, 2005.

[4] B. Schölkopf, et al. A Regularization Framework for Learning from Graph Data, 2004, ICML.

[5] P. Erdős, et al. On the evolution of random graphs, 1984.

[6] Duncan J. Watts, et al. Collective dynamics of ‘small-world’ networks, 1998, Nature.

[7] A. Rapoport, et al. Connectivity of random nets, 1951.

[8] Fabrice Rossi, et al. Dissemination of Health Information within Social Networks, 2012, ArXiv.

[9] Yi-Min Huang, et al. Weighted support vector machine for classification with uneven training class sizes, 2005, 2005 International Conference on Machine Learning and Cybernetics.

[10] Matthew Richardson, et al. Mining knowledge-sharing sites for viral marketing, 2002, KDD.

[11] D. Rubin, et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper, 1977.

[12] Bernhard E. Boser, et al. A training algorithm for optimal margin classifiers, 1992, COLT '92.

[13] M. L. Fisher, et al. An analysis of approximations for maximizing submodular set functions—I, 1978, Math. Program.

[14] Elchanan Mossel, et al. Submodularity of Influence in Social Networks: From Local to Global, 2010, SIAM J. Comput.

[15] Reuven Cohen, et al. The Generalized Maximum Coverage Problem, 2008, Inf. Process. Lett.

[16] Jiawei Han, et al. LINKREC: a unified framework for link recommendation with user attributes and graph structure, 2010, WWW '10.

[17] Zan Huang. Link Prediction Based on Graph Topology: The Predictive Value of Generalized Clustering Coefficient, 2010.

[18] Bernhard Schölkopf, et al. Learning from labeled and unlabeled data on a directed graph, 2005, ICML.

[19] Masahiro Kimura, et al. Prediction of Information Diffusion Probabilities for Independent Cascade Model, 2008, KES.

[20] A. E. Hoerl, et al. Ridge regression: biased estimation for nonorthogonal problems, 2000.

[21] Alexander Gammerman, et al. Ridge Regression Learning Algorithm in Dual Variables, 1998, ICML.

[22] Stephen Warshall, et al. A Theorem on Boolean Matrices, 1962, JACM.

[23] Éva Tardos, et al. Influential Nodes in a Diffusion Model for Social Networks, 2005, ICALP.

[24] Laks V. S. Lakshmanan, et al. Learning influence probabilities in social networks, 2010, WSDM '10.

[25] Matthew Richardson, et al. Mining the network value of customers, 2001, KDD '01.

[26] George L. Nemhauser, et al. Note--On "Location of Bank Accounts to Optimize Float: An Analytic Study of Exact and Approximate Algorithms", 1979.

[27] Paul Erdős, et al. On random graphs, I, 1959.