Paper information - Type-Ahead Exploratory Search through Typo and Word Order Tolerant Autocompletion

Type-Ahead Exploratory Search through Typo and Word Order Tolerant Autocompletion

There is increasing interest in recommending queries and query results to the user instantly, while the query is still being typed. This is evidenced by the emergence of several systems that offer such functionality, e.g. Google Instant Search for Web searching or Facebook Search for social searching. In this paper we consider richer recommendations that also show several other kinds of supplementary information, giving the user a better overview of the search space. This supplementary information can be the result of various tasks (e.g. textual clustering or entity mining of the top search results), may be very large, and may be expensive to derive. Presenting these recommendations instantly (as the user types a query letter by letter) helps the user (a) to quickly discover what is popular among other users, (b) to decide quickly which of the suggested query completions to use, and (c) to decide which hits of the returned answer to inspect. In this paper we focus on making this feasible (scalable) and flexible. Regarding scalability, we elaborate on an approach based on precomputed information and comparatively evaluate various trie-based index structures for making real-time interaction feasible even when the available memory is limited. Specifically, we show how modest hardware (like that of a mobile device) can provide instant access to large amounts of data. Moreover, we propose and experimentally evaluate an incremental procedure for updating the index. To improve the throughput that can be served, we analyze and experimentally evaluate various policies for caching subtries. With regard to flexibility, in order to reduce the user's effort and to increase the exploitation of the precomputed information, we elaborate on how the recommendations can tolerate different word orders and spelling errors, assuming the proposed trie-based index structures. The experimental results show that such functionality significantly increases the number of recommendations, especially for queries that contain several words. Finally, we propose an algorithm for computing the top-K suggestions that exploits the ranking information in order to reduce the trie traversals. An experimental evaluation shows that the proposed algorithm substantially improves retrieval time.
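To make the ideas in the abstract concrete, the following is a minimal Python sketch of a trie in which each node stores the best score in its subtree, so that a top-K lookup can expand the most promising branches first instead of traversing the whole subtrie. It is an illustration under stated assumptions, not the index structure or the algorithm of the paper: the names `SuggestionTrie`, `top_k` and `top_k_any_order` are hypothetical, each query is assumed to be inserted once with a numeric popularity score, and word-order tolerance is approximated naively by trying every permutation of the typed words.

```python
import heapq
from itertools import count, permutations


class TrieNode:
    def __init__(self):
        self.children = {}          # char -> TrieNode
        self.score = None           # score of the query ending here, if any
        self.best = float("-inf")   # highest score stored in this subtree


class SuggestionTrie:
    """Hypothetical trie of (query, score) pairs; not the paper's index."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, query, score):
        # Assumption: each query is inserted once, so `best` never goes stale.
        node = self.root
        node.best = max(node.best, score)
        for ch in query:
            node = node.children.setdefault(ch, TrieNode())
            node.best = max(node.best, score)
        node.score = score

    def top_k(self, prefix, k):
        # Walk down to the node that represents the typed prefix.
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        tie = count()  # unique tiebreaker so nodes are never compared
        # Heap entries: (-bound, tiebreak, text, node); node is None for a
        # finished query whose bound is its exact score.
        heap = [(-node.best, next(tie), prefix, node)]
        results = []
        while heap and len(results) < k:
            neg, _, text, n = heapq.heappop(heap)
            if n is None:
                results.append((text, -neg))
                continue
            if n.score is not None:
                heapq.heappush(heap, (-n.score, next(tie), text, None))
            for ch, child in n.children.items():
                heapq.heappush(heap, (-child.best, next(tie), text + ch, child))
        return results


def top_k_any_order(trie, typed, k):
    # Naive word-order tolerance (an assumption): query the trie with every
    # permutation of the typed words and merge the results by score.
    merged = {}
    for perm in set(permutations(typed.split(" "))):
        for text, score in trie.top_k(" ".join(perm), k):
            merged[text] = score
    return sorted(merged.items(), key=lambda item: -item[1])[:k]


trie = SuggestionTrie()
for query, score in [("trie", 5), ("tree", 9), ("trip", 7), ("java", 3)]:
    trie.insert(query, score)
print(trie.top_k("tr", 2))
```

Because a finished query is only emitted once its score is at least as high as the bound of every unexplored branch, the search can stop after K results without visiting low-scoring subtries. The permutation step grows factorially with the number of typed words, so it is only reasonable for short queries.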

Yannis Tzitzikas | Pavlos Fafalios
