User-curated image collections: Modeling and recommendation

Most state-of-the-art image retrieval and recommendation systems predominantly focus on individual images. In contrast, socially curated image collections, condensing distinctive yet coherent images into one set, are largely overlooked by the research communities. In this paper, we aim to design a novel recommendation system that can provide users with image collections relevant to individual personal preferences and interests. To this end, two key issues need to be addressed, i.e., image collection modeling and similarity measurement. For image collection modeling, we consider each image collection as a whole in a group sparse reconstruction framework and extract concise collection descriptors given the pretrained dictionaries. We then consider image collection recommendation as a dynamic similarity measurement problem in response to user's clicked image set, and employ a metric learner to measure the similarity between the image collection and the clicked image set. As there is no previous work directly comparable to this study, we implement several competitive baselines and related methods for comparison. The evaluations on a large scale Pinterest data set have validated the effectiveness of our proposed methods for modeling and recommending image collections.

[1]  Shiguang Shan,et al.  Image sets alignment for Video-Based Face Recognition , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[2]  Antonio Torralba,et al.  Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope , 2001, International Journal of Computer Vision.

[3]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[4]  P. J. Huber Robust Estimation of a Location Parameter , 1964 .

[5]  Koen E. A. van de Sande,et al.  Evaluating Color Descriptors for Object and Scene Recognition , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[6]  Daniel D. Lee,et al.  Grassmann discriminant analysis: a unifying view on subspace-based learning , 2008, ICML '08.

[7]  Saeideh Bakhshi,et al.  "I need to try this"?: a statistical overview of pinterest , 2013, CHI.

[8]  Martin Szummer,et al.  Indoor-outdoor image classification , 1998, Proceedings 1998 IEEE International Workshop on Content-Based Access of Image and Video Database.

[9]  LiuWei,et al.  Semi-supervised distance metric learning for collaborative image retrieval and clustering , 2010 .

[10]  Yiannis S. Boutalis,et al.  CEDD: Color and Edge Directivity Descriptor: A Compact Descriptor for Image Indexing and Retrieval , 2008, ICVS.

[11]  James M. Rehg,et al.  Visual Place Categorization: Problem, dataset, and algorithm , 2009, 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[12]  Cordelia Schmid,et al.  Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[13]  Ruiping Wang,et al.  Manifold Discriminant Analysis , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[14]  Wei-Ying Ma,et al.  IGroup: web image search results clustering , 2006, MM '06.

[15]  Jianping Fan,et al.  Image collection summarization via dictionary learning for sparse representation , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[16]  Rong Jin,et al.  A Boosting Framework for Visuality-Preserving Distance Metric Learning and Its Application to Medical Image Retrieval , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[17]  Antonio Torralba,et al.  Recognizing indoor scenes , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[18]  Cor J. Veenman,et al.  Kernel Codebooks for Scene Categorization , 2008, ECCV.

[19]  Jianping Fan,et al.  JustClick: Personalized Image Recommendation via Exploratory Search From Large-Scale Flickr Images , 2009, IEEE Transactions on Circuits and Systems for Video Technology.

[20]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[21]  Yihong Gong,et al.  Linear spatial pyramid matching using sparse coding for image classification , 2009, CVPR.

[22]  Jiebo Luo,et al.  Towards Scalable Summarization of Consumer Videos Via Sparse Dictionary Selection , 2012, IEEE Transactions on Multimedia.

[23]  James M. Rehg,et al.  CENTRIST: A Visual Descriptor for Scene Categorization , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[24]  Matthijs C. Dorst Distinctive Image Features from Scale-Invariant Keypoints , 2011 .

[25]  F. Mosteller,et al.  Low Moments for Small Samples: A Comparative Study of Order Statistics , 1947 .

[26]  Yihong Gong,et al.  Linear spatial pyramid matching using sparse coding for image classification , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[27]  Michael I. Jordan,et al.  Distance Metric Learning with Application to Clustering with Side-Information , 2002, NIPS.

[28]  Frederick R. Forst,et al.  On robust estimation of the location parameter , 1980 .

[29]  Emmanuel J. Candès,et al.  Templates for convex cone problems with applications to sparse signal recovery , 2010, Math. Program. Comput..