A probabilistic approach for discovering authoritative Web pages

The World Wide Web (WWW) is becoming the most important system for delivering information. Search services on the WWW are becoming increasingly popular among users because the huge amount of data available makes information difficult to retrieve and filter. Several works have argued that traditional term-based search engines are of limited use, since the resulting ranking depends on how precisely the user expresses the query. Users, however, are often unclear about the information they need and so give little thought to query formulation. Moreover, if the query pertains to a topic that is abundantly covered on the Web, search services become unusable because of the huge number of pages returned. For instance, at the time of this work, AltaVista returned more than 18,000,000 pages in reply to a query for documents related to the word "java".