Re-Identification Supervised Texture Generation

Estimating 3D human body pose and shape from a single image has been studied extensively in recent years, but the accompanying problem of texture generation has received much less attention. In this paper, we propose an end-to-end learning strategy that generates textures of human bodies under the supervision of person re-identification. We render synthetic images with textures extracted from the inputs and maximize the similarity between the rendered and input images, using a re-identification network as the perceptual metric. Experimental results on pedestrian images show that our model can generate a texture from a single image and that the resulting textures are of higher quality than those produced by other available methods. Furthermore, we extend the approach to other categories and explore possible uses of the generated textures.
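The idea of using a network's features as a perceptual metric can be illustrated with a minimal sketch. The abstract does not give the exact loss, so everything below is an assumption: the function names `reid_perceptual_loss` and `toy_feature_extractor` are hypothetical, the mean-squared feature distance is only one plausible choice, and the toy extractor is a stand-in for a real re-identification network, not the authors' model.

```python
import numpy as np


def toy_feature_extractor(image):
    """Hypothetical stand-in for a re-ID network (NOT a real model).

    Takes an (H, W, C) array with even H and W and returns two "layers":
    the image itself and a 2x2 average-pooled copy.
    """
    h, w, c = image.shape
    pooled = image.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))
    return [image, pooled]


def reid_perceptual_loss(feats_rendered, feats_input):
    """Average over layers of the mean squared feature difference.

    Assumed form of the metric: a smaller value means the rendered image
    looks more like the input in the feature space of the network.
    """
    if len(feats_rendered) != len(feats_input):
        raise ValueError("feature lists must have the same number of layers")
    total = 0.0
    for f_r, f_i in zip(feats_rendered, feats_input):
        total += float(np.mean((f_r - f_i) ** 2))
    return total / len(feats_rendered)


# Usage: compare a "rendered" image against the "input" image.
input_image = np.ones((4, 4, 3))
rendered_image = np.zeros((4, 4, 3))
loss = reid_perceptual_loss(
    toy_feature_extractor(rendered_image),
    toy_feature_extractor(input_image),
)
```

In a training setup along these lines, the loss would be minimized with respect to the texture generator's parameters, which requires a differentiable renderer and an automatic-differentiation framework rather than plain NumPy.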
