A constrained latent variable model

Latent variable models provide valuable compact representations for learning and inference in many computer vision tasks. However, most existing models cannot directly encode prior knowledge about the specific problem at hand. In this paper, we introduce a constrained latent variable model whose generated output inherently accounts for such knowledge. To this end, we propose an approach that explicitly imposes equality and inequality constraints on the model's output during learning, thus avoiding the computational burden of having to account for these constraints at inference. Our learning mechanism can exploit non-linear kernels, while only involving sequential closed-form updates of the model parameters. We demonstrate the effectiveness of our constrained latent variable model on the problem of non-rigid 3D reconstruction from monocular images, and show that it yields qualitative and quantitative improvements over several baselines.
