Improving the Annotation of DeepFashion Images for Fine-grained Attribute Recognition

DeepFashion is a widely used clothing dataset with 50 categories and more than 200k images overall, where each image is annotated with fine-grained attributes. The dataset is often used for clothes recognition, and although it provides comprehensive annotations, the attribute distribution is unbalanced and contains repeated attributes, which is especially problematic for training fine-grained attribute recognition models. In this work, we tailor DeepFashion to the fine-grained attribute recognition task by treating each category separately. After selecting the categories with a sufficient number of images for training, we remove very scarce attributes and merge duplicate ones in each category, then clean the dataset based on the new list of attributes. We use a bilinear convolutional neural network with a pairwise ranking loss function for multi-label fine-grained attribute recognition and show that the new annotations improve the results on this task. The detailed annotations for each of the selected categories are provided for public use.
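
The per-category cleaning described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `clean_category`, the hand-written `merge_map` of duplicate attributes, and the `min_count` scarcity threshold are all hypothetical, and the sketch merges duplicates before counting so that scarcity is judged on the merged attributes, which is an assumption about ordering rather than something the text specifies.

```python
from collections import Counter


def clean_category(annotations, merge_map, min_count):
    """Clean the attribute annotations of a single clothing category.

    annotations: dict mapping image id -> set of attribute names.
    merge_map:   dict mapping a duplicate attribute -> its canonical name
                 (hypothetical, assumed to be curated by hand).
    min_count:   attributes seen on fewer images than this are dropped
                 (assumed threshold, not taken from the paper).
    """
    # Step 1: merge duplicate attributes into their canonical name.
    merged = {
        image_id: {merge_map.get(attr, attr) for attr in attrs}
        for image_id, attrs in annotations.items()
    }

    # Step 2: count how many images carry each attribute, keep frequent ones.
    counts = Counter(attr for attrs in merged.values() for attr in attrs)
    kept = {attr for attr, n in counts.items() if n >= min_count}

    # Step 3: clean the dataset against the new attribute list and drop
    # images that are left with no attribute at all.
    cleaned = {}
    for image_id, attrs in merged.items():
        remaining = attrs & kept
        if remaining:
            cleaned[image_id] = remaining

    return cleaned, sorted(kept)


if __name__ == "__main__":
    toy = {
        "img_1": {"floral", "flower"},
        "img_2": {"flower", "striped"},
        "img_3": {"rare"},
    }
    cleaned, kept = clean_category(toy, {"flower": "floral"}, min_count=2)
    print(kept)
    print(cleaned)
```

In this toy run, `flower` is folded into `floral`, the attributes `striped` and `rare` fall below the threshold, and `img_3` is removed because no attribute remains for it.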
