Hard-Aware Fashion Attribute Classification

Fashion attribute classification is important to many high-level tasks such as fashion item search, fashion trend analysis, and fashion recommendation. The task is challenging because the data distribution is extremely imbalanced, particularly for attributes with only a few positive samples. In this paper, we introduce a hard-aware pipeline to make full use of "hard" samples and attributes. We first propose Hard-Aware BackPropagation (HABP) to efficiently and adaptively focus training on "hard" data. Then, for the identified hard labels, we propose to synthesize more complementary samples for training. To stabilize training, we extend the semi-supervised GAN by directly deactivating outputs for synthetic complementary samples (Deact). Overall, our method is more effective in addressing "hard" cases: HABP places more weight on "hard" samples, and for "hard" attributes with insufficient training data, Deact yields more stable synthetic samples for training and further improves performance. Our method is verified on a large-scale fashion dataset, outperforming other state-of-the-art methods without any additional supervision.
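
The abstract does not give the HABP formulation, so the following is only a minimal sketch of the general idea of weighting "hard" terms more heavily in a multi-label loss. It assumes a focal-style weight (1 - p_t)^gamma on each (sample, attribute) term; the function names `bce` and `hard_aware_loss` and the parameter `gamma` are hypothetical and are not taken from the paper.

```python
import math


def bce(p, y, eps=1e-7):
    """Binary cross-entropy for one predicted probability p and label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def hard_aware_loss(probs, labels, gamma=2.0):
    """Weighted mean BCE over a batch of multi-label predictions.

    probs, labels: lists of rows, one row per sample, one column per attribute.
    Each term is weighted by (1 - p_t) ** gamma, where p_t is the probability
    assigned to the true label, so poorly predicted ("hard") terms dominate.
    ASSUMPTION: this focal-style weight is illustrative, not the paper's HABP.
    """
    total, weight_sum = 0.0, 0.0
    for row_p, row_y in zip(probs, labels):
        for p, y in zip(row_p, row_y):
            p_t = p if y == 1 else 1.0 - p  # probability of the true label
            w = (1.0 - p_t) ** gamma        # larger weight for harder terms
            total += w * bce(p, y)
            weight_sum += w
    return total / weight_sum if weight_sum > 0 else 0.0


def plain_loss(probs, labels):
    """Unweighted mean BCE, for comparison."""
    terms = [bce(p, y) for rp, ry in zip(probs, labels) for p, y in zip(rp, ry)]
    return sum(terms) / len(terms)
```

With one well-predicted attribute and one badly predicted attribute in the same sample, for example `hard_aware_loss([[0.9, 0.2]], [[1, 1]])`, the weighted loss is pulled toward the badly predicted term and is larger than `plain_loss` on the same input. Setting `gamma=0` makes all weights equal and recovers the plain mean.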
