Emerging Applications of Link Analysis for Ranking

The rapid growth of digitally available information has greatly increased the popularity of search engine technology in recent years. At the same time, when interacting with this overwhelming quantity of data, people usually inspect only the few items most relevant to their task. It is thus very important to use high-quality ranking measures that efficiently identify these items across the various information retrieval activities we pursue. In this thesis we provide a twofold contribution to the Information Retrieval field. First, we identify application areas in which a user-oriented ranking is missing, although it is strongly needed to facilitate high-quality access to relevant resources. Second, for each of these areas we propose appropriate ranking algorithms that exploit their underlying social characteristics, at either the macroscopic or the microscopic level. We achieve this with link analysis techniques, which build on the graph-based representation of relations between resources in order to rank them, or simply to identify social patterns in the investigated data set. We start by arguing that Ranking Desktop Items is very effective in improving resource access within Personal Information Repositories. We therefore propose to move link analysis methods down to the PC Desktop by exploiting usage analysis statistics, and show the resulting importance ordering to be highly beneficial for the particular scenario of Desktop Search. We then apply the same technique to Spam Detection. We connect people across email social networks based on their email exchanges and induce a reputation metric that isolates malicious members of a community. Similarly, we model several higher-level artificial constructs that could negatively manipulate generic link analysis ranking algorithms, and indicate how to remove them in the case of Web page ranking.
Finally, we exploit manually created large-scale information repositories in order to Personalize Web Search. We investigate two types of such repositories, namely globally edited ones and individually edited ones. For the former, we project link analysis onto public taxonomies such as the Open Directory and define appropriate similarity measures that order the search output in accordance with each user’s preferences. For the latter, we propose to expand Web queries by applying both text and link analysis to Personal Information Repositories. Extensive experiments analyzing both approaches show them to yield significant improvements over regular Google search.
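
As a rough illustration of the kind of link analysis the abstract refers to, the following minimal sketch runs a generic PageRank-style power iteration over a small directed graph. It is not the thesis's own algorithm. The function name `pagerank`, the damping value of 0.85, the iteration count, and the toy email graph are all assumptions chosen for illustration.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Return a dict node -> score for a directed graph given as (src, dst) pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    scores = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Nodes without out-links (dangling nodes) spread their score uniformly.
        dangling = sum(scores[v] for v in nodes if not out_links[v])
        new_scores = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for src in nodes:
            targets = out_links[src]
            if targets:
                # Each node passes an equal share of its damped score to its targets.
                share = damping * scores[src] / len(targets)
                for dst in targets:
                    new_scores[dst] += share
        scores = new_scores
    return scores


# Hypothetical toy graph: an edge (a, b) means "a sent an email to b".
emails = [
    ("alice", "bob"),
    ("carol", "bob"),
    ("bob", "alice"),
    ("spammer", "alice"),
    ("spammer", "carol"),
]
ranks = pagerank(emails)
```

In this toy graph the node that only sends messages and never receives any ends up with the lowest score, which loosely mirrors the idea of a reputation metric derived from email exchanges. A real system would need far more than this sketch, for example safeguards against colluding senders.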
