Algorithmic Challenges in Web Search Engines

In this paper we present the main algorithmic challenges that large Web search engines face today. These challenges arise in every module of a Web retrieval system, from gathering the data to be indexed (crawling) to selecting and ordering the answers to a query (searching and ranking). Most of them ultimately concern the quality of the answer or the efficiency of obtaining it, although some, such as context-based advertising, bear on the very existence of search engines.
