Street-to-shop: Cross-scenario clothing retrieval via parts alignment and auxiliary set

In this paper, we address a practical problem of cross-scenario clothing retrieval — given a daily human photo captured in general environment, e.g., on street, finding similar clothing in online shops, where the photos are captured more professionally and with clean background. There are large discrepancies between daily photo scenario and online shopping scenario. We first propose to alleviate the human pose discrepancy by locating 30 human parts detected by a well trained human detector. Then, founded on part features, we propose a two-step calculation to obtain more reliable one-to-many similarities between the query daily photo and online shopping photos: 1) the within-scenario one-to-many similarities between a query daily photo and the auxiliary set are derived by direct sparse reconstruction; and 2) by a cross-scenario many-to-many similarity transfer matrix inferred offline from an extra auxiliary set and the online shopping set, the reliable cross-scenario one-to-many similarities between the query daily photo and all online shopping photos are obtained. We collect a large online shopping dataset and a daily photo dataset, both of which are thoroughly labeled with 15 clothing attributes via Mechanic Turk. The extensive experimental evaluations on the collected datasets well demonstrate the effectiveness of the proposed framework for cross-scenario clothing retrieval.

[1]  Dimitri P. Bertsekas,et al.  Constrained Optimization and Lagrange Multiplier Methods , 1982 .

[2]  G. Haines,et al.  The Relationship between Relative Attributes, Relative Preferences, and Market Share: The Case of Solar Energy in Canada , 1984 .

[3]  Hong Chen,et al.  Composite Templates for Cloth Modeling and Sketching , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[4]  Tommaso Gritti,et al.  A framework for robust feature selection for real-time fashion style recommendation , 2009, IMCE '09.

[5]  David A. Forsyth,et al.  Describing objects by their attributes , 2009, CVPR.

[6]  Allen Y. Yang,et al.  Robust Face Recognition via Sparse Representation , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7]  Yong Yu,et al.  Robust Subspace Segmentation by Low-Rank Representation , 2010, ICML.

[8]  Qiang Yang,et al.  A Survey on Transfer Learning , 2010, IEEE Transactions on Knowledge and Data Engineering.

[9]  Alexander C. Berg,et al.  Automatic Attribute Discovery and Characterization from Noisy Web Data , 2010, ECCV.

[10]  Subhransu Maji,et al.  Detecting People Using Mutually Consistent Poselet Activations , 2010, ECCV.

[11]  Basela Hasan,et al.  Segmentation using Deformable Spatial Priors with Application to Clothing , 2010, BMVC.

[12]  Tong Zhang,et al.  Clothes search in consumer photos via color matching and attribute learning , 2011, ACM Multimedia.

[13]  Ming Yang,et al.  Real-time clothing recognition in surveillance videos , 2011, 2011 18th IEEE International Conference on Image Processing.

[14]  Meng Wang,et al.  Predicting occupation via human clothing and contexts , 2011, 2011 International Conference on Computer Vision.

[15]  Subhransu Maji,et al.  Describing people: A poselet-based approach to attribute classification , 2011, 2011 International Conference on Computer Vision.

[16]  Yi Yang,et al.  Articulated pose estimation with flexible mixtures-of-parts , 2011, CVPR 2011.

[17]  Larry S. Davis,et al.  Image ranking and retrieval based on multi-attribute queries , 2011, CVPR 2011.

[18]  Nan Wang,et al.  Who Blocks Who: Simultaneous clothing segmentation for grouping images , 2011, 2011 International Conference on Computer Vision.

[19]  Fei-Fei Li,et al.  Hierarchical semantic indexing for large scale image retrieval , 2011, CVPR 2011.