Web projections: learning from contextual subgraphs of the web

Graphical relationships among Web pages have been exploited in methods for ranking search results. To date, specific graphical properties have been used in these analyses. We introduce a Web Projection methodology that generalizes prior efforts to exploit graphical relationships of the web in several ways. With this approach, we create subgraphs by projecting sets of pages and domains onto the larger web graph, and then use machine learning to construct predictive models that consider graphical properties as evidence. We describe the method and then present experiments that illustrate the construction of predictive models of search result quality and user query reformulation.
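The projection step described above can be sketched in a few lines. This is a minimal illustration and not the authors' implementation. It assumes the web graph is stored as a dictionary that maps each page to the set of pages it links to. The names `project` and `graph_features`, the toy graph, and the particular features (node count, edge count, density, weakly connected components, isolated nodes) are hypothetical choices made for the example. A feature dictionary of this kind could then be passed to any learning algorithm as evidence.

```python
from collections import deque


def project(web_graph, pages):
    """Return the directed subgraph of web_graph induced by `pages`.

    web_graph maps a page to the set of pages it links to (an assumed
    representation). Self-links are dropped.
    """
    pages = set(pages)
    return {p: (web_graph.get(p, set()) & pages) - {p} for p in pages}


def graph_features(subgraph):
    """Compute a few illustrative graph-theoretic features of a subgraph."""
    n = len(subgraph)
    m = sum(len(targets) for targets in subgraph.values())

    # Build an undirected view to count weakly connected components.
    undirected = {p: set() for p in subgraph}
    for src, targets in subgraph.items():
        for dst in targets:
            undirected[src].add(dst)
            undirected[dst].add(src)

    seen = set()
    components = 0
    for start in undirected:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        queue = deque([start])
        while queue:  # breadth-first search over one component
            node = queue.popleft()
            for neighbor in undirected[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)

    isolated = sum(1 for p in undirected if not undirected[p])
    density = m / (n * (n - 1)) if n > 1 else 0.0
    return {
        "nodes": n,
        "edges": m,
        "density": density,
        "components": components,
        "isolated": isolated,
    }


# Toy web graph and a hypothetical set of search results to project.
web_graph = {
    "a": {"b", "c"},
    "b": {"c"},
    "c": set(),
    "d": {"a"},
    "e": set(),
}
results = ["a", "b", "e"]
features = graph_features(project(web_graph, results))
print(features)
```

In this sketch only the links among the projected pages survive, so the features describe how the result set is connected within the larger graph rather than the graph as a whole.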
