Demystifying Core Ranking in Pinterest Image Search

The Pinterest image search engine helps hundreds of millions of users discover interesting content every day. This motivates us to improve image search quality by evolving our ranking techniques. In this work, we share how we design and deploy various ranking pipelines in the Pinterest image search ecosystem. Specifically, we focus on our research in three areas: training data, user/image featurization, and ranking models. Extensive offline and online studies compare the performance of different models and demonstrate the efficiency and effectiveness of our final launched ranking models.
