Diversity, Intent, and Aggregated Search

Diversity, intent, and aggregated search are three core retrieval concepts that have received significant attention. In search result diversification, one typically considers the relevance of a document in light of the other retrieved documents. The goal is to identify the probable "aspects" of an ambiguous query, retrieve documents for each of these aspects, and so make the search results more diverse. In the absence of any knowledge of the user's context or preferences, this increases the chance that the user will find at least one result relevant to their underlying information need. The probable "aspects" of a query may stem from lexical ambiguity (e.g., "flash": Adobe Flash, flashlight, Flash Gordon, Flash Airlines, flash mob, ...) or from intentional ambiguity (e.g., "pizza": how to make one, where to buy one, images, nutritional value, background, restaurant, ...). The automatic discovery of query intent has become an active research area, yielding a range of observational and algorithmic studies. Understanding the likely intents behind a query can help a search engine automatically route the query to the corresponding vertical search engines and obtain particularly relevant results, which in turn can improve user satisfaction. In aggregated search, the task is to search and assemble information from a variety of sources and to organize the resulting material within a single interface. The result page of a modern search engine often goes beyond a simple ranked list. Many specific intents are addressed by aggregated search solutions: specially presented documents, often retrieved from specific sources, that stand out from the regular organic search results.