Mix 'n Match: Integrating Text Matching and Product Substitutability within Product Search

Two products are substitutes if both can satisfy the same consumer need. Intrinsic incorporation of product substitutability, where substitutability is integrated within the latent vector space model itself, contrasts with extrinsic incorporation, where result lists are re-ranked after retrieval. Fusing the text matching and product substitutability objectives allows latent vector space models to mix and match the regularities contained within text descriptions and substitution relations. We introduce a method for intrinsically incorporating product substitutability within latent vector space models for product search that are estimated using gradient descent; it integrates seamlessly with state-of-the-art vector space models. We compare our method to existing methods for incorporating structural entity relations, in which product substitutability is incorporated extrinsically by re-ranking. Our method outperforms the best extrinsic method on four benchmarks. We investigate the effect of different levels of the text matching and product substitutability objectives, and analyze how incorporating product substitutability affects the diversity of product search rankings. Incorporating product substitutability information improves search relevance at the cost of diversity.
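To make the idea of fusing the two objectives concrete, the following is a minimal sketch under stated assumptions, not the method of the paper. The abstract does not specify the loss functions, so the softmax text matching loss, the squared-distance substitutability loss, the linear mixing weight `alpha`, and all function names below are illustrative assumptions.

```python
import numpy as np


def text_matching_loss(query_vecs, product_vecs, target_idx):
    """Assumed text matching objective: softmax cross-entropy over
    query-product dot products, averaged over queries."""
    scores = query_vecs @ product_vecs.T              # (n_queries, n_products)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(target_idx)), target_idx].mean()


def substitutability_loss(product_vecs, substitute_pairs):
    """Assumed substitutability objective: mean squared Euclidean distance
    between the vectors of products known to be substitutes."""
    left = product_vecs[substitute_pairs[:, 0]]
    right = product_vecs[substitute_pairs[:, 1]]
    return ((left - right) ** 2).sum(axis=1).mean()


def mixed_loss(query_vecs, product_vecs, target_idx, substitute_pairs, alpha):
    """Hypothetical fusion: alpha weights text matching, (1 - alpha) weights
    substitutability. alpha = 1 recovers a text-only model."""
    return (alpha * text_matching_loss(query_vecs, product_vecs, target_idx)
            + (1.0 - alpha) * substitutability_loss(product_vecs, substitute_pairs))


# Toy data: 2 queries, 4 products, 3-dimensional latent space.
rng = np.random.default_rng(0)
queries = rng.normal(size=(2, 3))
products = rng.normal(size=(4, 3))
targets = np.array([0, 2])            # relevant product per query
pairs = np.array([[0, 1], [2, 3]])    # (product, substitute) index pairs

loss = mixed_loss(queries, products, targets, pairs, alpha=0.5)
```

Because both terms are differentiable in the product vectors, a single gradient descent procedure could in principle optimize them jointly, which is what distinguishes an intrinsic approach from re-ranking a finished result list. Varying `alpha` here is one simple way to express different levels of the two objectives.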
