Building Roadmaps of Minima and Transitions in Visual Models

Becoming trapped in suboptimal local minima is a perennial problem when optimizing visual models, particularly in applications like monocular human body tracking where complicated parametric models are repeatedly fitted to ambiguous image measurements. We show that trapping can be significantly reduced by building ‘roadmaps’ of nearby minima linked by transition pathways (paths leading over low ‘mountain passes’ in the cost surface), each found by locating the transition state (codimension-1 saddle point) at the top of the pass and then sliding downhill to the next minimum. We present two families of transition-state-finding algorithms based on local optimization. In eigenvector tracking, unconstrained Newton minimization is modified to climb uphill towards a transition state, while in hypersurface sweeping, a moving hypersurface is swept through the space and the moving local minima within it are tracked using a constrained Newton method. These widely applicable numerical methods, which appear not to be known in vision and optimization, generalize methods from computational chemistry, where finding transition states is critical for predicting reaction parameters. Experiments on the challenging problem of estimating 3D human pose from monocular images show that our algorithms find nearby transition states and minima very efficiently, but also underline the disturbingly large numbers of minima that can exist in this and similar model-based vision problems.
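The eigenvector tracking idea can be illustrated on a toy problem. The sketch below is not the authors' implementation: the two-dimensional double-well cost, the function names (`find_transition_state`, `descend`) and the simple step-length clipping are all assumptions made for illustration. It takes a Newton-like step in the Hessian eigenbasis, reverses the step along one tracked eigenvector so that the cost is maximized in that direction and minimized in all others, and then slides downhill from the resulting saddle to the two minima it links.

```python
import numpy as np

# Hypothetical toy cost with minima at (-1, 0) and (+1, 0) and a saddle at (0, 0).
def cost(p):
    x, y = p
    return (x**2 - 1.0)**2 + 10.0 * y**2

def grad(p):
    x, y = p
    return np.array([4.0 * x * (x**2 - 1.0), 20.0 * y])

def hess(p):
    x, _ = p
    return np.array([[12.0 * x**2 - 4.0, 0.0], [0.0, 20.0]])

def find_transition_state(p0, max_step=0.2, tol=1e-10, max_iter=200):
    """Climb along one tracked Hessian eigenvector, descend along the rest."""
    p = np.asarray(p0, dtype=float)  # start slightly off the minimum (nonzero gradient)
    track = None
    for _ in range(max_iter):
        g = grad(p)
        if np.linalg.norm(g) < tol:
            break
        lam, V = np.linalg.eigh(hess(p))  # eigenvalues in ascending order
        # First step: follow the softest mode. Later: follow the mode that
        # overlaps most with the previously tracked direction.
        k = 0 if track is None else int(np.argmax(np.abs(V.T @ track)))
        track = V[:, k]
        g_modes = V.T @ g
        # Downhill Newton-like step in every mode (|lambda| keeps it downhill)...
        step_modes = -g_modes / np.maximum(np.abs(lam), 1e-8)
        # ...then reversed along the tracked mode so that it goes uphill.
        step_modes[k] = -step_modes[k]
        step = V @ step_modes
        n = np.linalg.norm(step)
        if n > max_step:  # crude step-length limit (an assumed damping rule)
            step *= max_step / n
        p = p + step
    return p

def descend(p0, lr=0.05, iters=500):
    """Plain gradient descent, used to slide down from the saddle."""
    p = np.asarray(p0, dtype=float)
    for _ in range(iters):
        p = p - lr * grad(p)
    return p

start = np.array([-0.9, 0.1])  # near the minimum at (-1, 0)
saddle = find_transition_state(start)
lam, V = np.linalg.eigh(hess(saddle))
v = V[:, 0]  # the single negative-curvature direction at the saddle
# Perturb to either side of the pass and slide down to the linked minima.
minima = [descend(saddle + s * 1e-2 * v) for s in (+1.0, -1.0)]
print("saddle:", saddle, "cost:", cost(saddle))
print("linked minima:", minima)
```

A real visual model would replace the analytic `grad` and `hess` with derivatives of the image-based cost, and would typically need a more careful damping or trust-region rule than the fixed step cap used here.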
