Learning Latent Vector Spaces for Product Search

We introduce a novel latent vector space model that jointly learns latent representations of words and e-commerce products, together with a mapping between the two, without the need for explicit annotations. The power of the model lies in its ability to directly model the discriminative relation between products and a particular word. We compare our method to existing latent vector space models (LSI, LDA and word2vec) and evaluate it as a feature in a learning-to-rank setting. Our latent vector space model achieves its enhanced performance because it learns better product representations. Furthermore, the mapping from words to products and the representations of words benefit directly from the errors propagated back from the product representations during parameter estimation. We provide an in-depth analysis of the model's performance and examine the structure of the learned representations.
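
To make the three jointly learned components concrete (word representations, product representations, and a mapping from the word space to the product space), the following is a minimal sketch of how such a model could score products for a query. The abstract does not specify the form of the mapping or the similarity function, so the tanh-affine projection, the averaging of word vectors, the dot-product scoring, and all names and sizes below are assumptions made for illustration only. Parameters are random here; in the model described above they would be estimated jointly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup; none of these values come from the paper.
vocab = ["red", "running", "shoes", "laptop"]
num_products, word_dim, product_dim = 3, 4, 2

# The three parameter groups that would be learned jointly.
word_emb = rng.normal(size=(len(vocab), word_dim))           # word representations
product_emb = rng.normal(size=(num_products, product_dim))   # product representations
W = rng.normal(size=(product_dim, word_dim))                 # word-to-product mapping
b = np.zeros(product_dim)


def project_query(words):
    """Average the word vectors and map the result into the product space.

    The tanh-affine form is an assumption, not taken from the abstract.
    """
    idx = [vocab.index(w) for w in words]
    avg = word_emb[idx].mean(axis=0)
    return np.tanh(W @ avg + b)


def rank_products(words):
    """Return product indices sorted by decreasing dot-product score."""
    scores = product_emb @ project_query(words)
    return np.argsort(-scores), scores


ranking, scores = rank_products(["running", "shoes"])
print("ranking:", ranking)
```

Because the score depends on the word vectors, the mapping, and the product vectors at once, a gradient-based training objective on this score would update all three together, which is consistent with the abstract's remark that the words and the mapping benefit from errors propagated back from the product representations.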
