A Mathematical and Sociological Analysis of Google Search Algorithm

Abstract: This article analyzes the Google search algorithm, which finds relevant information online based on keywords, phrases, links, and webpages, in both mathematical and sociological settings. We first survey the mathematical studies related to the Google search engine and then present a new convergence analysis of the search algorithm together with a new update scheme. Next, drawing on sociological knowledge, we propose using both in- and out-linkages, as well as second-order linkages, to refine and improve the search algorithm. We justify the proposed improvements on sociological grounds and prove mathematically that the two new search algorithms converge.
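For readers unfamiliar with the iteration whose convergence is at issue, the following is a minimal sketch of the standard PageRank power iteration as it is commonly described in the literature. It is not the new update scheme or the in-/out-linkage and second-order variants proposed in the article, whose details the abstract does not give. The function name `pagerank`, the damping value 0.85, the tolerance, and the four-page example graph are all illustrative assumptions.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=1000):
    """Power iteration for the standard PageRank vector (illustrative sketch).

    links maps each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {q for targets in links.values() for q in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from the uniform distribution
    for _ in range(max_iter):
        # Rank held by dangling pages (no out-links) is spread uniformly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {p: base for p in pages}
        for p in pages:
            targets = links.get(p, [])
            for q in targets:
                # Each page passes its rank equally along its out-links.
                new_rank[q] += damping * rank[p] / len(targets)
        # The 1-norm of the change contracts by at least the damping factor per step.
        change = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if change < tol:
            break
    return rank


# Hypothetical four-page web: "d" links out but receives no links.
web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(web)
```

In this toy graph the scores sum to one, page "c" (which collects the most in-links) ranks highest, and page "d" receives only the teleportation share (1 - 0.85)/4. The geometric contraction by the damping factor is the usual reason this basic iteration converges.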
