Stochastic Approximation and Rate-Distortion Analysis for Robust Structure and Motion Estimation

Recent research on structure and motion recovery has focused on the sensitivity and robustness of existing techniques. One possible reason is that, in practical applications, the assumptions underlying existing algorithms are often violated. In this paper, we propose a framework for 3D reconstruction from short monocular video sequences that takes into account the statistical errors in reconstruction algorithms. Detailed error analysis is especially important for this problem because the motion between pairs of frames is small, and slight perturbations in its estimates can lead to large errors in the 3D reconstruction. We focus on the following issues: physical sources of errors, their experimental and theoretical analysis, robust estimation techniques, and measures for characterizing the quality of the final reconstruction. We derive a precise relationship between the error in the reconstruction and the error in the image correspondences. This error analysis is used to design a robust, recursive multi-frame fusion algorithm within the framework of “stochastic approximation”, which is capable of dealing with incomplete information about errors in observations. Rate-distortion analysis is proposed for evaluating the quality of the final reconstruction as a function of the number of frames and the error in the image correspondences. Finally, to demonstrate the effectiveness of the algorithm, examples of depth reconstruction are shown for different video sequences.
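As a rough illustration of what recursive fusion by stochastic approximation can look like, the sketch below applies the textbook Robbins-Monro recursion to a stream of noisy scalar depth estimates, one per frame. It is not the algorithm of this paper: the function names, the scalar depth model, the Gaussian noise, and the 1/k gain schedule are all assumptions made for illustration only.

```python
import random


def robbins_monro_fuse(estimates, gain=lambda k: 1.0 / k):
    """Recursively fuse noisy scalar estimates (hypothetical helper).

    Update rule: fused_k = fused_{k-1} + a_k * (x_k - fused_{k-1}).
    With the assumed gain a_k = 1/k this reduces to the running mean.
    """
    fused = 0.0
    for k, x in enumerate(estimates, start=1):
        fused += gain(k) * (x - fused)
    return fused


def simulate(true_depth=5.0, noise_std=0.5, num_frames=200, seed=0):
    """Generate synthetic per-frame depth estimates and fuse them.

    The Gaussian noise model is an assumption of this sketch; the
    recursion itself never uses the noise statistics.
    """
    rng = random.Random(seed)
    estimates = [true_depth + rng.gauss(0.0, noise_std) for _ in range(num_frames)]
    return robbins_monro_fuse(estimates)


if __name__ == "__main__":
    print(simulate())
```

The update needs only the latest estimate and the current fused value, so it runs frame by frame without storing the sequence and without an explicit model of the observation noise.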
