On the approximability of the link building problem

We consider the LINK BUILDING problem: maximize the PageRank value of a given target vertex in a directed graph by adding k new links that point to the target (backlinks). We present a theorem describing how the topology of the graph affects the choice of potential new backlinks. Based on this theorem, we show that no fully polynomial-time approximation scheme (FPTAS) exists for LINK BUILDING unless P = NP, and that LINK BUILDING is W[1]-hard. Furthermore, we show that the problem is in the class APX by presenting the polynomial-time algorithm r-Greedy, which selects new backlinks greedily and yields a PageRank value for the target vertex that is within a constant factor of the best possible. We also consider the naive algorithm π-Naive, which chooses backlinks from vertices whose PageRank values are high relative to their out-degree, and show that on certain graphs it performs much worse than our constant-factor approximation. Finally, we provide a lower bound for the approximation ratio of r-Greedy.
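
To make the two selection strategies in the abstract concrete, the following is a minimal Python sketch under stated assumptions; it is not the paper's r-Greedy or its naive algorithm. The abstract does not define the parameter r or the exact naive scoring rule, so the sketch uses a plain one-link-at-a-time greedy loop and scores naive candidates by PageRank divided by (out-degree + 1). The function names, the damping factor 0.85, the uniform treatment of dangling vertices, and the iteration count are all illustrative choices.

```python
def pagerank(out_links, damping=0.85, iters=200):
    """Power-iteration PageRank. out_links maps each vertex to its set of successors."""
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        # Assumption: rank held by vertices without out-links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for u in nodes:
            for v in out_links[u]:
                new[v] += damping * rank[u] / len(out_links[u])
        rank = new
    return rank


def with_backlinks(out_links, sources, target):
    """Return a copy of the graph with a link u -> target added for each u in sources."""
    graph = {u: set(vs) for u, vs in out_links.items()}
    for u in sources:
        graph[u].add(target)
    return graph


def candidate_sources(out_links, target):
    """Vertices that could receive a new link to the target."""
    return [u for u in out_links if u != target and target not in out_links[u]]


def greedy_backlinks(out_links, target, k):
    """Plain greedy: k times, add the backlink that maximizes the target's PageRank."""
    chosen = []
    for _ in range(k):
        remaining = [u for u in candidate_sources(out_links, target) if u not in chosen]
        if not remaining:
            break
        best = max(
            remaining,
            key=lambda u: pagerank(with_backlinks(out_links, chosen + [u], target))[target],
        )
        chosen.append(best)
    return chosen


def naive_backlinks(out_links, target, k):
    """Naive: rank candidates once by PageRank / (out-degree + 1) and take the top k."""
    rank = pagerank(out_links)
    scored = sorted(
        candidate_sources(out_links, target),
        key=lambda u: rank[u] / (len(out_links[u]) + 1),
        reverse=True,
    )
    return scored[:k]


if __name__ == "__main__":
    # Hypothetical five-vertex graph; vertex 0 is the target.
    graph = {0: {1}, 1: {2}, 2: {0}, 3: {1}, 4: {1, 2}}
    for name, pick in (("greedy", greedy_backlinks), ("naive", naive_backlinks)):
        sources = pick(graph, 0, 2)
        value = pagerank(with_backlinks(graph, sources, 0))[0]
        print(name, sources, round(value, 4))
```

The greedy loop recomputes PageRank for every candidate at every step, which keeps the sketch short but is far more expensive than the single PageRank computation used by the naive rule. The naive rule never re-evaluates its scores after a link is added, which is one intuition for why a static heuristic of this kind can fall behind on some graphs.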
