Representative Fashion Feature Extraction by Leveraging Weakly Annotated Online Resources

In this work, we propose a method for extracting representative features for fashion analysis by leveraging weakly annotated online fashion images. The proposed system consists of two stages. In the first stage, we detect clothing items in a fashion image: top clothes (t), bottom clothes (b), and one-pieces (o). In the second stage, we extract discriminative features from the detected regions for various applications of interest. Unlike previous work that relies heavily on well-annotated fashion data, we propose a way to collect fashion images from online resources and annotate them automatically. Based on this methodology, we create a new fashion dataset, called Web Attributes, to train our feature extractor. Experiments show that the extracted regional features capture local characteristics of fashion images well and offer better performance than previous methods.
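
The two-stage design above can be illustrated with a minimal sketch. Everything here is hypothetical: the function names, the fixed boxes standing in for a learned detector, and the per-channel mean standing in for a CNN feature extractor trained on the weakly annotated data are all placeholders, not the authors' actual implementation.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1)

def detect_clothing(image: List[List[Tuple[int, int, int]]]) -> Dict[str, Box]:
    """Stage 1 (placeholder): return a box per clothing category.

    A real detector would localize top (t), bottom (b), and one-piece (o)
    regions; here we simply split the image into upper and lower halves.
    """
    h, w = len(image), len(image[0])
    return {"t": (0, 0, w, h // 2), "b": (0, h // 2, w, h)}

def crop(image, box: Box):
    """Cut out the region of the image covered by the box."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]

def extract_features(region) -> List[float]:
    """Stage 2 (placeholder): per-channel mean intensity as a stand-in
    for a learned regional feature vector."""
    pixels = [px for row in region for px in row]
    n = len(pixels)
    return [sum(p[c] for p in pixels) / n for c in range(3)]

def regional_features(image) -> Dict[str, List[float]]:
    """Run both stages: detect regions, then describe each one."""
    return {name: extract_features(crop(image, box))
            for name, box in detect_clothing(image).items()}
```

The point of the sketch is the data flow, not the models: each detected region gets its own descriptor, so downstream applications can compare garments locally rather than via a single whole-image feature.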
