Retrieving Similar Styles to Parse Clothing

Clothing recognition is a societally and commercially important yet extremely challenging problem due to large variations in clothing appearance, layering, style, and body shape and pose. In this paper, we tackle the clothing parsing problem using a retrieval-based approach. For a query image, we find similar styles from a large database of tagged fashion images and use these examples to recognize clothing items in the query. Our approach combines parsing from: pre-trained global clothing models, local clothing models learned on the fly from retrieved examples, and transferred parse-masks (Paper Doll item transfer) from retrieved examples. We evaluate our approach extensively and show significant improvements over previous state-of-the-art for both localization (clothing parsing given weak supervision in the form of tags) and detection (general clothing parsing). Our experimental results also indicate that the general pose estimation problem can benefit from clothing parsing.

[1]  Ming Yang,et al.  Real-time clothing recognition in surveillance videos , 2011, 2011 18th IEEE International Conference on Image Processing.

[2]  Pushmeet Kohli,et al.  Simultaneous Segmentation and Pose Estimation of Humans Using Dynamic Graph Cuts , 2008, International Journal of Computer Vision.

[3]  Antonio Torralba,et al.  SIFT Flow: Dense Correspondence across Scenes and Its Applications , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4]  Subhransu Maji,et al.  Detecting People Using Mutually Consistent Poselet Activations , 2010, ECCV.

[5]  Yi Yang,et al.  Articulated pose estimation with flexible mixtures-of-parts , 2011, CVPR 2011.

[6]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[7]  Changsheng Xu,et al.  Hi, magic closet, tell me what to wear! , 2012, ACM Multimedia.

[8]  Vladimir Kolmogorov,et al.  An experimental comparison of min-cut/max- flow algorithms for energy minimization in vision , 2001, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9]  Andrew Zisserman,et al.  A Statistical Approach to Texture Classification from Single Images , 2004, International Journal of Computer Vision.

[10]  Richard Szeliski,et al.  Finding People in Repeated Shots of the Same Scene , 2006, BMVC.

[11]  Andrea Vedaldi,et al.  Vlfeat: an open and portable library of computer vision algorithms , 2010, ACM Multimedia.

[12]  Nan Wang,et al.  Who Blocks Who: Simultaneous clothing segmentation for grouping images , 2011, 2011 International Conference on Computer Vision.

[13]  David J. Kriegman,et al.  From Bikers to Surfers: Visual Recognition of Urban Tribes , 2013, BMVC.

[14]  Ming Shao,et al.  What Do You Do? Occupation Recognition in a Photo via Social Context , 2013, 2013 IEEE International Conference on Computer Vision.

[15]  David J. Kriegman,et al.  Urban tribes: Analyzing group photos from a social perspective , 2012, 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

[16]  Jian Dong,et al.  A Deformable Mixture Parsing Model with Parselets , 2013, 2013 IEEE International Conference on Computer Vision.

[17]  Martha Larson,et al.  Fashion-focused creative commons social dataset , 2013, MMSys.

[18]  Andrew Zisserman,et al.  Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[19]  Vladimir Kolmogorov,et al.  What energy functions can be minimized via graph cuts? , 2002, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20]  Luc Van Gool,et al.  Apparel Classification with Style , 2012, ACCV.

[21]  Gang Wang,et al.  Modeling fashion , 2013, 2013 IEEE International Conference on Multimedia and Expo (ICME).

[22]  Huizhong Chen,et al.  Describing Clothing by Semantic Attributes , 2012, ECCV.

[23]  Tamara L. Berg,et al.  Paper Doll Parsing: Retrieving Similar Styles to Parse Clothing Items , 2013, 2013 IEEE International Conference on Computer Vision.

[24]  Cordelia Schmid,et al.  Accurate Object Recognition with Shape Masks , 2012, International Journal of Computer Vision.

[25]  Luis E. Ortiz,et al.  Parsing clothing in fashion photographs , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[26]  Robinson Piramuthu,et al.  Style Finder: Fine-Grained Clothing Style Detection and Retrieval , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops.

[27]  Mark S. Nixon,et al.  Mobile visual clothing search , 2013, 2013 IEEE International Conference on Multimedia and Expo Workshops (ICMEW).

[28]  Alexander C. Berg,et al.  Automatic Attribute Discovery and Characterization from Noisy Web Data , 2010, ECCV.

[29]  Yannis Kalantidis,et al.  Getting the look: clothing recognition and segmentation for automatic product suggestions in everyday photos , 2013, ICMR.

[30]  Meng Wang,et al.  Predicting occupation via human clothing and contexts , 2011, 2011 International Conference on Computer Vision.

[31]  Chih-Jen Lin,et al.  LIBLINEAR: A Library for Large Linear Classification , 2008, J. Mach. Learn. Res..

[32]  Svetlana Lazebnik,et al.  Superparsing , 2010, International Journal of Computer Vision.

[33]  Bernt Schiele,et al.  Robust Object Detection with Interleaved Categorization and Segmentation , 2008, International Journal of Computer Vision.

[34]  Olga Veksler,et al.  Fast approximate energy minimization via graph cuts , 2001, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[35]  Svetlana Lazebnik,et al.  Finding Things: Image Parsing with Regions and Per-Exemplar Detectors , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[36]  Jitendra Malik,et al.  Shape Guided Object Segmentation , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[37]  Shuicheng Yan,et al.  Fashion Parsing With Weak Color-Category Labels , 2014, IEEE Transactions on Multimedia.

[38]  Deva Ramanan,et al.  Learning to parse images of articulated bodies , 2006, NIPS.

[39]  Tsuhan Chen,et al.  Clothing cosegmentation for recognizing people , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[40]  智一 吉田,et al.  Efficient Graph-Based Image Segmentationを用いた圃場図自動作成手法の検討 , 2014 .

[41]  Antonio Criminisi,et al.  TextonBoost: Joint Appearance, Shape and Context Modeling for Multi-class Object Recognition and Segmentation , 2006, ECCV.

[42]  Dragomir Anguelov,et al.  Contextual Identity Recognition in Personal Photo Albums , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[43]  Luc Van Gool,et al.  Human Pose Estimation Using Body Parts Dependent Joint Regressors , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[44]  Martha Larson,et al.  Fashion 10000: an enriched social image dataset for fashion and clothing , 2014, MMSys '14.

[45]  Andrew Zisserman,et al.  Human Pose Estimation Using a Joint Pixel-wise and Part-wise Formulation , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[46]  Changsheng Xu,et al.  Street-to-shop: Cross-scenario clothing retrieval via parts alignment and auxiliary set , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[47]  Michael J. Black,et al.  A 2D Human Body Model Dressed in Eigen Clothing , 2010, ECCV.

[48]  Andrew Zisserman,et al.  Pose search: Retrieving people using their pose , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[49]  Josep Lladós,et al.  High-Level Clothes Description Based on Colour-Texture and Structural Features , 2003, IbPRIA.

[50]  VekslerOlga,et al.  Fast Approximate Energy Minimization via Graph Cuts , 2001 .

[51]  Adriana Kovashka,et al.  WhittleSearch: Image search with relative attribute feedback , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[52]  Subhransu Maji,et al.  Describing people: A poselet-based approach to attribute classification , 2011, 2011 International Conference on Computer Vision.

[53]  Rainer Stiefelhagen,et al.  Part-based clothing segmentation for person retrieval , 2011, 2011 8th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS).

[54]  Michael J. Black,et al.  The Naked Truth: Estimating Body Shape Under Clothing , 2008, ECCV.

[55]  Alexander C. Berg,et al.  Hipster Wars: Discovering Elements of Fashion Styles , 2014, ECCV.

[56]  Simone Calderara,et al.  A complete system for garment segmentation and color classification , 2013, Machine Vision and Applications.

[57]  Basela Hasan,et al.  Segmentation using Deformable Spatial Priors with Application to Clothing , 2010, BMVC.

[58]  Cordelia Schmid,et al.  TagProp: Discriminative metric learning in nearest neighbor models for image auto-annotation , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[59]  Ivan Laptev,et al.  Pose Estimation and Segmentation of People in 3D Movies , 2013, 2013 IEEE International Conference on Computer Vision.

[60]  Hong Chen,et al.  Composite Templates for Cloth Modeling and Sketching , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).