Fashion Landmark Detection and Category Classification for Robotics

Research on automated, image-based identification of clothing categories and fashion landmarks has recently gained significant interest due to its potential impact on areas such as robotic clothing manipulation, automated clothes sorting and recycling, and online shopping. Several public, annotated fashion datasets have been created to facilitate research in this direction. In this work, we take the first step towards leveraging the data and techniques developed for fashion image analysis in vision-based robotic clothing manipulation tasks. We focus on techniques that can generalize from large-scale fashion datasets to less structured, small datasets collected in a robotic lab. Specifically, we propose training data augmentation methods such as elastic warping, and model adjustments such as rotation-invariant convolutions, to make the model generalize better. Our experiments demonstrate that our approach outperforms state-of-the-art models with respect to clothing category classification and fashion landmark detection when tested on previously unseen datasets. Furthermore, we present experimental results on a new dataset, collected in our lab, of images in which a robot holds different garments.
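To make the elastic warping augmentation mentioned above more concrete, the following is a minimal sketch of one common formulation: a random displacement field is smoothed with a Gaussian, scaled, and used to resample the image. It is not the implementation used in this work; the function names (`elastic_warp`, `smooth2d`, `gaussian_kernel1d`) and the parameters `alpha` (displacement strength in pixels) and `sigma` (smoothness of the field) are assumptions chosen for illustration, and the default values are arbitrary.

```python
import numpy as np


def gaussian_kernel1d(sigma, truncate=3.0):
    """Normalized 1-D Gaussian kernel with radius of about truncate * sigma."""
    radius = max(1, int(truncate * sigma + 0.5))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()


def smooth2d(field, sigma):
    """Separable Gaussian smoothing of a 2-D array (zero padding at borders).

    Assumes both image sides are longer than the kernel, so that
    np.convolve(..., mode="same") keeps the original length.
    """
    k = gaussian_kernel1d(sigma)
    out = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, field)
    out = np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, out)
    return out


def elastic_warp(image, alpha=8.0, sigma=4.0, rng=None):
    """Warp an (H, W) or (H, W, C) array with a smooth random displacement field.

    alpha and sigma are illustrative parameters, not values from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]

    # Random per-pixel offsets in [-1, 1], smoothed so that neighbouring
    # pixels move together, then scaled by alpha.
    dx = smooth2d(rng.uniform(-1.0, 1.0, (h, w)), sigma) * alpha
    dy = smooth2d(rng.uniform(-1.0, 1.0, (h, w)), sigma) * alpha

    # Backward mapping: each output pixel samples the input at (y + dy, x + dx),
    # clipped to the image bounds.
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_y = np.clip(ys + dy, 0, h - 1)
    src_x = np.clip(xs + dx, 0, w - 1)

    # Bilinear interpolation between the four surrounding pixels.
    y0 = np.floor(src_y).astype(int)
    x0 = np.floor(src_x).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = src_y - y0
    wx = src_x - x0

    img = image.astype(np.float64)
    if img.ndim == 3:
        # Broadcast the interpolation weights over the channel axis.
        wy = wy[..., None]
        wx = wx[..., None]

    top = img[y0, x0] * (1 - wx) + img[y0, x1] * wx
    bottom = img[y1, x0] * (1 - wx) + img[y1, x1] * wx
    return top * (1 - wy) + bottom * wy
```

For landmark detection, one option under these assumptions is to stack the image with its landmark heatmaps along the channel axis before calling `elastic_warp`, so that the same displacement field is applied to both and the annotations stay aligned with the warped garment.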
