Deep Fashion Analysis with Feature Map Upsampling and Landmark-Driven Attention

In this paper, we propose an attentive fashion network to address three problems in fashion analysis: landmark localization, category classification, and attribute prediction. By using a landmark prediction branch with an upsampling network structure, we boost the accuracy of fashion landmark localization. With the aid of the predicted landmarks, we propose a landmark-driven attention mechanism that helps improve the precision of fashion category classification and attribute prediction. Experimental results show that our approach outperforms state-of-the-art methods on the DeepFashion dataset.
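
To make the idea of landmark-driven attention concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the network described in the paper: the abstract does not specify how the predicted landmarks are turned into attention, so the element-wise max over per-landmark heatmaps and the residual reweighting F * (1 + A) are assumptions, and the names `merge_heatmaps` and `apply_attention` are hypothetical.

```python
from typing import List

Map = List[List[float]]  # one H x W grid of values


def merge_heatmaps(heatmaps: List[Map]) -> Map:
    """Collapse K per-landmark heatmaps into one attention map.

    Assumption: element-wise max across landmarks; the abstract does not
    state how the predicted landmarks are combined.
    """
    height, width = len(heatmaps[0]), len(heatmaps[0][0])
    return [
        [max(hm[y][x] for hm in heatmaps) for x in range(width)]
        for y in range(height)
    ]


def apply_attention(features: List[Map], attention: Map) -> List[Map]:
    """Reweight every feature channel as F * (1 + A).

    Assumption: a residual form, so regions far from any landmark keep
    their original response instead of being zeroed out.
    """
    return [
        [
            [value * (1.0 + attention[y][x]) for x, value in enumerate(row)]
            for y, row in enumerate(channel)
        ]
        for channel in features
    ]


# Toy input: two landmark heatmaps on a 2x2 grid and one feature channel.
heatmaps = [
    [[1.0, 0.0], [0.0, 0.0]],  # landmark 1 peaks at the top-left cell
    [[0.0, 0.0], [0.0, 0.5]],  # landmark 2 responds weakly at bottom-right
]
features = [[[1.0, 1.0], [1.0, 1.0]]]

attention = merge_heatmaps(heatmaps)
attended = apply_attention(features, attention)
print(attention)  # [[1.0, 0.0], [0.0, 0.5]]
print(attended)   # [[[2.0, 1.0], [1.0, 1.5]]]
```

In this toy case, cells near a predicted landmark are amplified while the remaining cells pass through unchanged. The classification and attribute branches would then operate on the reweighted features.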
