Clothing-to-words mapping using word separation method

With the growth of e-commerce, clothing search on the Internet has emerged as a valuable and challenging problem. Compared with standard image retrieval, clothing search poses two main difficulties. The first is the large variation in clothing appearance. The second is that people want to find clothing that shares the same visual elements despite this variation. Motivated by the Graph Cut method, we propose an approach called the word separation method, which maps the visual elements of clothing to words while simultaneously taking into account the image-to-image, image-to-word, and word-to-word relationships. In our work, meaningful words extracted from web pages are represented as graph nodes, and the graph edges are weighted by the context of the data set, which is collected from the Internet. Experimental results on the clothing data set demonstrate the efficiency, effectiveness, and robustness of our method.
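
The abstract does not state the energy function or the graph construction in detail, so the following is only a minimal sketch of how a graph-cut style word selection could look, not the authors' implementation. It assumes each candidate word is a node, that a hypothetical image-to-word score sets the terminal edge weights, and that a hypothetical word-to-word affinity sets the edges between words. The image-to-image relationship is omitted for brevity. The function names, the integer scores, and the example words are all illustrative assumptions.

```python
from collections import defaultdict, deque

def min_cut_source_side(capacity, source, sink):
    """Return the nodes on the source side of a minimum s-t cut (Edmonds-Karp)."""
    # Residual graph: forward capacities plus zero-capacity reverse edges.
    residual = defaultdict(lambda: defaultdict(float))
    for u, neighbours in capacity.items():
        for v, c in neighbours.items():
            residual[u][v] += c
            residual[v][u] += 0.0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            # No augmenting path left: reachable nodes form the source side.
            return set(parent)
        # Find the bottleneck capacity along the path, then push flow.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u

def separate_words(image_word_score, word_word_affinity, max_score=10):
    """Keep the words that end up on the source side of the minimum cut.

    image_word_score: word -> integer score in [0, max_score] (assumed input).
    word_word_affinity: (word_a, word_b) -> symmetric affinity (assumed input).
    """
    source, sink = "<keep>", "<drop>"  # hypothetical terminal node names
    capacity = defaultdict(dict)
    for word, score in image_word_score.items():
        capacity[source][word] = score             # cost paid if the word is dropped
        capacity[word][sink] = max_score - score   # cost paid if the word is kept
    for (a, b), affinity in word_word_affinity.items():
        capacity[a][b] = affinity                  # cost of separating related words
        capacity[b][a] = affinity
    kept = min_cut_source_side(capacity, source, sink)
    return kept - {source}

# Illustrative, made-up scores for one clothing image.
scores = {"striped": 9, "floral": 2, "collar": 5}
affinity = {("striped", "collar"): 3, ("striped", "floral"): 1}
print(sorted(separate_words(scores, affinity)))  # ['collar', 'striped']
```

In this toy example "collar" has a neutral image-to-word score, and it is kept only because of its affinity with "striped". This illustrates why a joint cut over the graph can differ from thresholding each word independently.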
