hNet: Single-shot 3D shape reconstruction using structured light and h-shaped global guidance network

Abstract: Retrieving three-dimensional (3D) shape information from a single two-dimensional (2D) image has recently gained considerable attention in a variety of fields. Despite recent advances in algorithms and hardware, ease of use and accuracy remain central concerns in 3D shape reconstruction. This paper presents a robust 3D shape reconstruction technique that integrates a structured-light 3D imaging scheme with deep convolutional neural network (CNN) learning. The structured-light patterns facilitate feature extraction, while the CNN model sidesteps the complexity of traditional 3D shape reconstruction. In the supervised learning pipeline, the input is either a single fringe-pattern image or a single speckle-pattern image, and the output is the corresponding high-accuracy 3D shape label. Unlike the widely used autoencoder-based CNN models, the proposed model introduces a global guidance network path with multi-scale feature fusion to improve the accuracy of the 3D shape reconstruction. Experimental evaluations have been conducted to demonstrate the validity and robustness of the proposed technique, which offers a promising tool for a growing range of scientific research and engineering applications.
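As a rough illustration of what multi-scale feature fusion can mean in general, the following NumPy sketch pools a feature map at several scales, upsamples each result back to the original resolution, and concatenates everything along the channel axis. It is not the hNet architecture: the function names, the pooling factors, the channel count, and the random "features" are all assumptions made for illustration, and a real network would use learned convolutions rather than fixed pooling.

```python
import numpy as np

def avg_pool(x, k):
    """Average-pool a (C, H, W) feature map by an integer factor k."""
    c, h, w = x.shape
    return x.reshape(c, h // k, k, w // k, k).mean(axis=(2, 4))

def upsample_nearest(x, k):
    """Nearest-neighbour upsample a (C, H, W) feature map by factor k."""
    return x.repeat(k, axis=1).repeat(k, axis=2)

def fuse_multi_scale(features, factors):
    """Hypothetical fusion: keep the input, add one pooled-and-upsampled
    branch per factor, and concatenate all branches along the channel axis."""
    branches = [features]
    for k in factors:
        branches.append(upsample_nearest(avg_pool(features, k), k))
    return np.concatenate(branches, axis=0)

# Stand-in for features extracted from a single fringe-pattern image
# (random values; the 4-channel, 32x32 shape is an arbitrary assumption).
rng = np.random.default_rng(0)
fringe_features = rng.standard_normal((4, 32, 32))

fused = fuse_multi_scale(fringe_features, factors=(2, 4, 8))
print(fused.shape)
```

Each coarser branch summarizes a larger neighbourhood of the image, so the concatenated result carries both local detail and more global context at every pixel, which is the intuition behind fusing features across scales.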
