Islands of Interest: Mining Concentrations of User Search Intent over e-Commerce Product Categories

e-Commerce has many core problems that can benefit from data mining: constructing recommendations for users, designing product taxonomies that users find easy to navigate, and allocating inventory to facilities to minimize shipping costs, to name a few. A large component of e-commerce data consists of search activity. Search as a share of traffic has overtaken direct browsing, and most e-commerce sites generate more search data than browse data. Search session data is also more voluminous than any user-aggregated data, since it includes anonymous sessions. However, search data is inherently local to a query. It is therefore not immediately obvious whether it can be used to build global knowledge (i.e., knowledge that involves no query) to address these problems. In this paper, we introduce a global structure, namely islands of interest, that is mined from local search data. We show that these concentrations of user search intent are highly relevant to each of the e-commerce problems mentioned above. We introduce two algorithms that can identify islands of interest, one based on community detection and the other on clustering. We build a framework that can compare the characteristics of the islands identified by these two approaches. We believe that, in addition to providing insights into user behavior, islands of interest can be important in tackling less-researched problems such as the design of product taxonomies.
