Recommendation Subgraphs for Web Discovery

Recommendations are central to the utility of many popular e-commerce websites. Such sites typically display a set of recommendations on every product page, enabling visitors and crawlers to navigate the website easily. Recommendations of this kind are present on nearly all e-commerce websites. Choosing an appropriate set of recommendations for each page is a critical task performed by dedicated backend software systems. We formalize the concept of recommendations used for discovery as a natural graph optimization problem on a bipartite graph and propose three methods for solving it, in increasing order of sophistication: a local random sampling algorithm, a greedy algorithm, and a more involved partitioning-based algorithm. We first analyze the performance of these three methods theoretically on random graph models, characterizing when each method yields a solution of sufficient quality and the parameter ranges in which more sophistication is needed. We complement this by providing an empirical analysis of these algorithms on simulated and real-world production data from a retail website. Our results confirm that it is not always necessary to implement complicated algorithms in the real world, and demonstrate that very good practical results can be obtained with simple heuristics backed by concrete theoretical guarantees.
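To make the simplest of the three methods concrete, here is a minimal sketch of local random sampling on a bipartite graph. The abstract does not state the exact objective, so the sketch rests on labeled assumptions: each product page keeps at most `c` of its candidate recommendations, and an item counts as discoverable once at least `a` pages link to it. The names `candidates`, `c`, `a`, `sample_recommendations`, and `covered_items` are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def sample_recommendations(candidates, c, rng):
    """For each page, keep up to c candidate items chosen uniformly at random.

    Assumption: `candidates` maps a page id to the set of items it could
    recommend (the edges of the bipartite graph).
    """
    chosen = {}
    for page, items in candidates.items():
        pool = sorted(items)  # fixed order so a seeded rng is reproducible
        chosen[page] = rng.sample(pool, min(c, len(pool)))
    return chosen


def covered_items(chosen, a):
    """Return items that receive at least a recommendations (assumed objective)."""
    counts = Counter(item for items in chosen.values() for item in items)
    return {item for item, n in counts.items() if n >= a}


# Toy bipartite graph: three pages, three candidate items.
candidates = {
    "p1": {"x", "y", "z"},
    "p2": {"x", "y"},
    "p3": {"y", "z"},
}
rng = random.Random(0)
chosen = sample_recommendations(candidates, c=2, rng=rng)
covered = covered_items(chosen, a=1)
print(chosen, covered)
```

Each page decides independently using only its own candidate list, which is what makes the method local. A greedy or partitioning-based method would instead coordinate choices across pages, at a higher implementation cost.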
