Monte Carlo methods for Web search

[1]  David M. Pennock,et al.  Methods for Sampling Pages Uniformly from the World Wide Web , 2001 .

[2]  Andrei Z. Broder,et al.  On the resemblance and containment of documents , 1997, Proceedings. Compression and Complexity of SEQUENCES 1997.

[3]  Joan Feigenbaum,et al.  On graph problems in a semi-streaming model , 2005, Theor. Comput. Sci.

[4]  Steve Chien,et al.  Approximating Aggregate Queries about Web Pages via Random Walks , 2000, VLDB.

[5]  Jennifer Widom,et al.  Scaling personalized web search , 2003, WWW '03.

[6]  T. Landauer,et al.  Indexing by Latent Semantic Analysis , 1990 .

[7]  Raffaele Giancarlo,et al.  New results for finding common neighborhoods in massive graphs in the data stream model , 2008, Theor. Comput. Sci.

[8]  Gene H. Golub,et al.  Matrix computations , 1983 .

[9]  Monika Henzinger,et al.  Finding Related Pages in the World Wide Web , 1999, Comput. Networks.

[10]  Marc Najork,et al.  Measuring Index Quality Using Random Walks on the Web , 1999, Comput. Networks.

[11]  M. Kendall,et al.  Rank Correlation Methods , 1949 .

[12]  András A. Benczúr,et al.  To randomize or not to randomize: space optimal summaries for hyperlink analysis , 2006, WWW '06.

[13]  Johannes Gehrke,et al.  Mining Very Large Databases , 1999, Computer.

[14]  Eyal Kushilevitz,et al.  Communication Complexity , 1997, Adv. Comput.

[15]  Torsten Suel,et al.  Design and implementation of a high-performance distributed Web crawler , 2002, Proceedings 18th International Conference on Data Engineering.

[16]  Ian H. Witten,et al.  Managing gigabytes (2nd ed.): compressing and indexing documents and images , 1999 .

[17]  Sriram Raghavan,et al.  WebBase: a repository of Web pages , 2000, Comput. Networks.

[18]  Ziv Bar-Yossef,et al.  An information statistics approach to data stream and communication complexity , 2002, The 43rd Annual IEEE Symposium on Foundations of Computer Science, 2002. Proceedings.

[19]  Ronald Fagin,et al.  Searching the workplace web , 2003, WWW '03.

[20]  Michael Elkin,et al.  Efficient algorithms for constructing (1+∊,β)-spanners in the distributed and streaming models , 2006, Distributed Computing.

[21]  Christos Faloutsos,et al.  ANF: a fast and scalable tool for data mining in massive graphs , 2002, KDD.

[22]  Andrew Hume,et al.  Gecko: Tracking a Very Large Billing System , 2000, USENIX Annual Technical Conference, General Track.

[23]  Rajeev Motwani,et al.  The PageRank Citation Ranking: Bringing Order to the Web , 1999, WWW 1999.

[24]  Micah Adler,et al.  Towards compressing Web graphs , 2001, Proceedings DCC 2001. Data Compression Conference.

[25]  Gene H. Golub,et al.  Extrapolation methods for accelerating PageRank computations , 2003, WWW '03.

[26]  Anne Rogers,et al.  Hancock: A language for analyzing transactional data streams , 2004, TOPL.

[27]  Jeffrey D. Ullman.  The MIDAS data-mining project at Stanford , 1999, Proceedings. IDEAS'99. International Database Engineering and Applications Symposium.

[28]  Jeffrey Scott Vitter,et al.  External memory algorithms and data structures: dealing with massive data , 2001, CSUR.

[29]  C. Papadimitriou,et al.  The complexity of massive data set computations , 2002 .

[30]  Jennifer Widom,et al.  SimRank: a measure of structural-context similarity , 2002, KDD.

[31]  Michael McGill,et al.  Introduction to Modern Information Retrieval , 1983 .

[32]  Taher H. Haveliwala.  Topic-sensitive PageRank , 2002, IEEE Trans. Knowl. Data Eng.

[33]  Rajiv Ramaswami,et al.  Automatic fault detection, isolation, and recovery in transparent all-optical networks , 1997 .

[34]  Evangelos E. Milios,et al.  Node similarity in networked information spaces , 2001, CASCON.

[35]  Eric A. Brewer,et al.  Lessons from Giant-Scale Services , 2001, IEEE Internet Comput.

[36]  Gerard Salton,et al.  Term-Weighting Approaches in Automatic Text Retrieval , 1988, Inf. Process. Manag.

[37]  Qiang Yang,et al.  Scalable collaborative filtering using cluster-based smoothing , 2005, SIGIR '05.

[38]  Marc Najork,et al.  On near-uniform URL sampling , 2000, Comput. Networks.

[39]  Shlomo Moran,et al.  Rank-Stability and Rank-Similarity of Link-Based Web Ranking Algorithms in Authority-Connected Graphs , 2005, Information Retrieval.

[40]  Edith Cohen,et al.  Size-Estimation Framework with Applications to Transitive Closure and Reachability , 1997, J. Comput. Syst. Sci.

[41]  Alan M. Frieze,et al.  Min-Wise Independent Permutations , 2000, J. Comput. Syst. Sci.

[42]  Andrei Z. Broder,et al.  Sic transit gloria telae: towards an understanding of the web's decay , 2004, WWW '04.

[43]  Gene H. Golub,et al.  Exploiting the Block Structure of the Web for Computing , 2003 .

[44]  Dániel Fogaras,et al.  Scaling link-based similarity search , 2005, WWW '05.

[45]  Salvatore J. Stolfo,et al.  Cost-based modeling for fraud and intrusion detection: results from the JAM project , 2000, Proceedings DARPA Information Survivability Conference and Exposition. DISCEX'00.

[46]  Joan Feigenbaum,et al.  Graph distances in the streaming model: the value of space , 2005, SODA '05.

[47]  Prabhakar Raghavan,et al.  Computing on data streams , 1999, External Memory Algorithms.

[48]  Salvatore J. Stolfo,et al.  Distributed data mining in credit card fraud detection , 1999, IEEE Intell. Syst.

[49]  Jop F. Sibeyn,et al.  Algorithms for Memory Hierarchies: Advanced Lectures , 2003 .

[50]  Jon Kleinberg,et al.  Authoritative sources in a hyperlinked environment , 1999, SODA '98.

[51]  Sebastiano Vigna,et al.  UbiCrawler: a scalable fully distributed Web crawler , 2004, Softw. Pract. Exp.

[52]  Ziv Bar-Yossef,et al.  Reductions in streaming algorithms, with an application to counting triangles in graphs , 2002, SODA '02.

[53]  Luiz André Barroso,et al.  Web Search for a Planet: The Google Cluster Architecture , 2003, IEEE Micro.

[54]  S. Muthukrishnan,et al.  Data streams: algorithms and applications , 2005, SODA '03.

[55]  Sepandar D. Kamvar,et al.  An Analytical Comparison of Approaches to Personalizing PageRank , 2003 .

[56]  Thorsten Joachims,et al.  Optimizing search engines using clickthrough data , 2002, KDD.

[57]  Ronald Fagin,et al.  Comparing top k lists , 2003, SODA '03.

[58]  B. Bollobás,et al.  Extremal Graph Theory , 2013 .

[59]  Dmitri Loguinov,et al.  IRLbot: scaling to 6 billion pages and beyond , 2008, WWW.

[60]  David Liben-Nowell,et al.  The link-prediction problem for social networks , 2007 .

[61]  Henry G. Small,et al.  Co-citation in the scientific literature: A new measure of the relationship between two documents , 1973, J. Am. Soc. Inf. Sci.

[62]  Ian H. Witten,et al.  Compressing and indexing documents and images , 1999 .

[63]  Sriram Raghavan,et al.  Searching the Web , 2001, ACM Trans. Internet Techn.