Getting the look: clothing recognition and segmentation for automatic product suggestions in everyday photos

We present a scalable approach to automatically suggesting relevant clothing products given a single image without metadata. We formulate the problem as cross-scenario retrieval: the query is a real-world image, whereas products in online shopping catalogs are usually presented in a clean environment. Our approach has two main stages: (a) starting from articulated pose estimation, we segment the person area and cluster promising image regions in order to detect the clothing classes present in the query image; (b) we use image retrieval techniques to retrieve visually similar products from each of the detected classes. We achieve clothing detection performance comparable to the state of the art on a very recent annotated dataset, while being more than 50 times faster. Finally, we present a large-scale clothing suggestion scenario in which the product database contains over one million products.
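To make the second stage concrete, the following is a minimal sketch of per-class retrieval, not the authors' implementation. It assumes that the first stage has already produced one feature vector per detected clothing class, and that the catalog is grouped by class. The function names (`retrieve_similar`, `suggest_products`), the toy feature vectors, and the use of brute-force Euclidean distance are all illustrative assumptions; a catalog of over one million products would likely call for an approximate nearest-neighbor index instead.

```python
import math


def retrieve_similar(query_vec, products, k=3):
    """Return the ids of the k products closest to query_vec.

    products is a list of (product_id, feature_vector) pairs.
    Brute-force Euclidean distance is an assumption made for brevity.
    """
    ranked = sorted(products, key=lambda p: math.dist(query_vec, p[1]))
    return [product_id for product_id, _ in ranked[:k]]


def suggest_products(detected, catalog, k=3):
    """Suggest products for each clothing class detected in the query image.

    detected: dict mapping class name -> query feature vector
              (hypothetical output of the detection stage).
    catalog:  dict mapping class name -> list of (product_id, feature_vector).
    Classes that have no catalog entries are skipped.
    """
    suggestions = {}
    for cls, query_vec in detected.items():
        if cls not in catalog:
            continue
        suggestions[cls] = retrieve_similar(query_vec, catalog[cls], k)
    return suggestions


if __name__ == "__main__":
    # Toy data: two-dimensional features stand in for real image descriptors.
    catalog = {
        "skirt": [("s1", [0.0, 0.0]), ("s2", [1.0, 1.0]), ("s3", [0.2, 0.1])],
        "boots": [("b1", [5.0, 5.0])],
    }
    detected = {"skirt": [0.1, 0.1], "hat": [1.0, 0.0]}
    print(suggest_products(detected, catalog, k=2))
```

Restricting the search to the detected classes keeps each query limited to a fraction of the catalog, which is one plausible reason the two-stage design scales well.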
