Reinforced Similarity Integration in Image-Rich Information Networks

Social multimedia sharing and hosting websites, such as Flickr and Facebook, contain billions of user-submitted images. Popular Internet commerce websites such as Amazon.com also host tremendous numbers of product-related images. In addition, images in such social networks are accompanied by annotations, comments, and other information, thus forming heterogeneous image-rich information networks. In this paper, we introduce the concept of a (heterogeneous) image-rich information network and the problem of how to perform information retrieval and recommendation in such networks. We propose a fast algorithm, heterogeneous minimum order k-SimRank (HMok-SimRank), to compute link-based similarity in weighted heterogeneous information networks. We then propose an algorithm, Integrated Weighted Similarity Learning (IWSL), that accounts for both link-based and content-based similarities by considering the network structure and mutually reinforcing link similarity and feature weight learning. Both local and global feature learning methods are designed. Experimental results on Flickr and Amazon data sets show that our approach is significantly better than traditional methods in terms of both relevance and speed. A new product search and recommendation system for e-commerce has been implemented based on our algorithm.
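
To make the notion of link-based similarity concrete, the following is a minimal sketch of plain, unweighted SimRank on a toy image-tag network. It is not HMok-SimRank, whose details are not given in this abstract; it only illustrates the underlying idea that two objects are similar when their neighbors are similar. The function name `simrank`, the decay factor 0.8, the iteration count, and the toy edges are all assumptions made for illustration.

```python
from itertools import product


def simrank(neighbors, decay=0.8, iterations=10):
    """Plain SimRank on an undirected graph given as {node: set of neighbors}.

    Illustrative baseline only: no edge weights and no node-type handling.
    """
    nodes = list(neighbors)
    # Start from the identity: each node is fully similar only to itself.
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, repeat=2)}
    for _ in range(iterations):
        new_sim = {}
        for a, b in product(nodes, repeat=2):
            if a == b:
                new_sim[(a, b)] = 1.0
                continue
            na, nb = neighbors[a], neighbors[b]
            if not na or not nb:
                new_sim[(a, b)] = 0.0
                continue
            # Average similarity over all neighbor pairs, damped by the decay factor.
            total = sum(sim[(i, j)] for i in na for j in nb)
            new_sim[(a, b)] = decay * total / (len(na) * len(nb))
        sim = new_sim
    return sim


# Hypothetical toy network: images linked to the tags that annotate them.
edges = [
    ("img1", "sunset"), ("img1", "beach"),
    ("img2", "sunset"), ("img2", "beach"),
    ("img3", "car"),
]
neighbors = {}
for image, tag in edges:
    neighbors.setdefault(image, set()).add(tag)
    neighbors.setdefault(tag, set()).add(image)

scores = simrank(neighbors)
```

In this toy network, `img1` and `img2` share both tags and therefore receive a positive similarity, while `img3` shares no tag with either and stays at zero. Computing all pairs in this way is quadratic in the number of nodes per iteration, which is the kind of cost a fast variant would need to reduce.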
