Gaussian Process Deep Belief Networks: A Smooth Generative Model of Shape with Uncertainty Propagation

The shape of an object is an important characteristic for many vision problems such as segmentation, detection, and tracking. Because shape is independent of appearance, a shape model can generalize to a large range of objects from only small amounts of data. However, shapes represented as silhouette images are challenging to model because their complicated likelihood functions lead to intractable posteriors. In this paper we present a generative model of shapes that provides a low-dimensional latent encoding which, importantly, resides on a smooth manifold with respect to the silhouette images. The proposed model propagates uncertainty in a principled manner, which allows it to learn from small amounts of data and to provide predictions with associated uncertainty. Our experiments show that the proposed model achieves favorable quantitative results compared with the state of the art while providing a representation that resides on a low-dimensional, interpretable manifold.
