Result enrichment in commerce search using browse trails

Commerce search engines have become popular in recent years, as users increasingly search for (and buy) products on the web. In response to a user query, they surface links to products in their catalog (or index) that match the requirements specified in the query. Often, few or no products in the catalog match the user query exactly, and the search engine is forced to return a set of products that only partially match the query. This paper considers the problem of choosing a set of products in response to a user query so as to maximize user satisfaction. We call this the result enrichment problem in commerce search. The challenge in result enrichment is twofold: first, the search engine needs to estimate the extent to which a user genuinely cares about an attribute that she has specified in a query; then, it must display products in the catalog that match the user's requirement on the important attributes, but that have similar, though possibly non-identical, values on the less important ones. To this end, we propose a technique for measuring the importance of individual attribute values and the similarity between different values of an attribute. A novelty of our approach is that we use entire browse trails, rather than just clickthrough rates, for this estimation. We develop a model for this problem, propose an algorithm to solve it, and support our theoretical findings with experiments conducted on actual user data.
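
The two-step idea in the abstract (estimate how much a queried attribute value matters, then relax the less important ones toward similar values) can be illustrated with a minimal Python sketch. This is not the paper's model or algorithm: the trail representation (a pair of queried attributes and the attributes of the product the trail ends on), the frequency-based estimates, and the names `estimate_from_trails` and `mismatch_penalty` are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def estimate_from_trails(trails):
    """Estimate importance and similarity from (query_attrs, final_attrs) pairs.

    Assumption: each browse trail is reduced to the attributes named in the
    query and the attributes of the product the user finally settled on.
    """
    asked = Counter()             # times an (attribute, value) was queried
    kept = Counter()              # times the final product had that same value
    drift = defaultdict(Counter)  # queried value -> counts of other final values
    for query_attrs, final_attrs in trails:
        for attr, value in query_attrs.items():
            key = (attr, value)
            asked[key] += 1
            final_value = final_attrs.get(attr)
            if final_value == value:
                kept[key] += 1
            elif final_value is not None:
                drift[key][final_value] += 1
    # Importance: fraction of trails in which the user stuck to the queried value.
    importance = {key: kept[key] / n for key, n in asked.items()}
    # Similarity: fraction of trails that ended on a given alternative value.
    similarity = {
        key: {other: c / asked[key] for other, c in counts.items()}
        for key, counts in drift.items()
    }
    return importance, similarity


def mismatch_penalty(query_attrs, product_attrs, importance, similarity):
    """Sum, over queried attributes, of importance times dissimilarity."""
    penalty = 0.0
    for attr, value in query_attrs.items():
        key = (attr, value)
        weight = importance.get(key, 1.0)  # unseen values are treated as strict
        got = product_attrs.get(attr)
        match = 1.0 if got == value else similarity.get(key, {}).get(got, 0.0)
        penalty += weight * (1.0 - match)
    return penalty


# Toy data (invented for illustration only).
query = {"brand": "canon", "color": "black"}
trails = [
    (query, {"brand": "canon", "color": "silver"}),
    (query, {"brand": "canon", "color": "black"}),
    (query, {"brand": "canon", "color": "silver"}),
    (query, {"brand": "canon", "color": "grey"}),
]
importance, similarity = estimate_from_trails(trails)

catalog = [
    {"brand": "nikon", "color": "black"},
    {"brand": "canon", "color": "silver"},
]
# Products with the lowest penalty are the best partial matches.
ranked = sorted(
    catalog, key=lambda p: mismatch_penalty(query, p, importance, similarity)
)
```

In this toy data users never leave the queried brand but often end on a different color, so the sketch ranks the silver Canon product ahead of the black Nikon one. A clickthrough-only estimate would see just the first click after the query, whereas the trail endpoint used here reflects where the user eventually settled.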
