Building Roadmaps of Local Minima of Visual Models

Getting trapped in suboptimal local minima is a perennial problem in model-based vision, especially in applications like monocular human body tracking, where complex nonlinear parametric models are repeatedly fitted to ambiguous image data. We show that the trapping problem can be attacked by building 'roadmaps' of nearby minima linked by transition pathways: paths leading over low 'cols' or 'passes' in the cost surface, found by locating the transition state (codimension-1 saddle point) at the top of the pass and then sliding downhill to the next minimum. We know of no previous vision or optimization work on numerical methods for locating transition states, but such methods do exist in computational chemistry, where transitions are critical for predicting reaction parameters. We present two families of methods, originally derived in chemistry, but here generalized, clarified and adapted to the needs of model-based vision: eigenvector tracking is a modified form of damped Newton minimization, while hypersurface sweeping sweeps a moving hypersurface through the space, tracking minima within it. Experiments on the challenging problem of estimating 3D human pose from monocular images show that our algorithms find nearby transition states and minima very efficiently, but also underline the disturbingly large number of minima that exist in this and similar model-based vision problems.
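
The idea of climbing to a transition state and then sliding downhill can be illustrated on a toy problem. The following is a minimal sketch, not the algorithm of the paper: it assumes a hypothetical two-dimensional double-well cost with minima at (-1, 0) and (1, 0) and a saddle at the origin, and it uses a simplified eigenvector-following step (Newton-like steps with absolute curvatures, ascending along one tracked Hessian eigenvector and descending along the others, with a hypothetical step-length cap `max_step` standing in for proper damping). All function names, the cost surface and the parameter values are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical toy cost: a double well with minima at (-1, 0) and (1, 0)
# and a codimension-1 saddle (transition state) at the origin.
def cost(p):
    x, y = p
    return (x**2 - 1.0)**2 + y**2

def grad(p):
    x, y = p
    return np.array([4.0 * x * (x**2 - 1.0), 2.0 * y])

def hess(p):
    x, _ = p
    return np.array([[12.0 * x**2 - 4.0, 0.0], [0.0, 2.0]])

def find_saddle(minimum, direction, kick=0.1, max_step=0.2,
                tol=1e-8, max_iter=100):
    """Simplified eigenvector following: go uphill along one tracked
    Hessian eigenvector and downhill along all the others."""
    direction = np.asarray(direction, dtype=float)
    track = direction / np.linalg.norm(direction)
    # The gradient vanishes at a minimum, so start slightly displaced.
    p = np.asarray(minimum, dtype=float) + kick * track
    for _ in range(max_iter):
        g = grad(p)
        if np.linalg.norm(g) < tol:
            break
        lam, vecs = np.linalg.eigh(hess(p))
        # Track the eigenvector with the largest overlap with the previous one.
        k = int(np.argmax(np.abs(vecs.T @ track)))
        track = vecs[:, k]
        g_modes = vecs.T @ g                    # gradient in the eigenbasis
        scale = np.maximum(np.abs(lam), 1e-6)   # guard against zero curvature
        signs = -np.ones_like(lam)              # descend along the other modes
        signs[k] = 1.0                          # ascend along the tracked mode
        step = vecs @ (signs * g_modes / scale)
        length = np.linalg.norm(step)
        if length > max_step:                   # crude step-length cap
            step *= max_step / length
        p = p + step
    return p, track

def slide_downhill(p, rate=0.05, tol=1e-10, max_iter=10000):
    """Plain fixed-step gradient descent to the minimum below p."""
    p = np.asarray(p, dtype=float)
    for _ in range(max_iter):
        g = grad(p)
        if np.linalg.norm(g) < tol:
            break
        p = p - rate * g
    return p

start_min = np.array([-1.0, 0.0])
saddle, mode = find_saddle(start_min, direction=[1.0, 0.5])
# Step past the saddle, away from the starting minimum, then descend.
away = mode if mode @ (saddle - start_min) > 0 else -mode
next_min = slide_downhill(saddle + 0.05 * away)
print("saddle:", saddle, "cost:", cost(saddle))
print("next minimum:", next_min, "cost:", cost(next_min))
```

Near the saddle the tracked mode has negative curvature, so the step reduces to an ordinary Newton step and converges quickly; far from it, the sign flip on the tracked mode is what turns a minimizer into a saddle finder. Repeating the procedure from each newly found minimum, in several initial directions, is one way to grow a roadmap of linked minima.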
