Multi-image matching using invariant features

This thesis concerns the problems of automatic image stitching and 3D modelling from multiple views. These are fundamental problems in computer vision, with applications in robotics, architecture, industrial inspection, surveillance, computer graphics and film. Recent work has brought increasing automation to these tasks, but despite considerable progress, state-of-the-art algorithms still require some form of user input or assumptions about the image sequence. For example, the best image stitchers currently require either an ordered set of input images or user input to identify the matching images before automatic registration can proceed. In this work we show how such tasks can be performed fully automatically, without any user input. We formulate the multi-image matching problem as one of finding all matching images, subject to the constraint that they are consistent views from a perspective camera. We use invariant features as a mechanism for finding correspondences, and indexing techniques to find matches between multiple views efficiently. We then find all sets of geometrically consistent feature matches, using a probabilistic model for verification. This allows us to identify each object or scene in the dataset using only the structure already present in the data. The major contributions of this thesis are the development of a system that can automatically recognise and stitch 2D panoramas in unordered image datasets, and a new class of invariant features for this purpose.
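
As a rough illustration of the final step described above (identifying each object or scene from the verified matches alone), the sketch below groups images into connected components of a match graph. It is a minimal sketch under stated assumptions, not the method used in the thesis: the function name `group_images` and the input `verified_pairs` are hypothetical, and the pairs are assumed to have already passed feature matching and geometric verification.

```python
from collections import defaultdict


def group_images(num_images, verified_pairs):
    """Group images into connected components of the match graph.

    verified_pairs is an iterable of (i, j) image-index pairs that are
    ASSUMED to have already passed geometric verification (hypothetical
    input; the verification itself is not implemented here).
    """
    # Union-find forest: each image starts as its own group.
    parent = list(range(num_images))

    def find(x):
        # Follow parent links to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the groups of every verified image pair.
    for i, j in verified_pairs:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_j] = root_i

    # Collect images by their root to form the final groups.
    groups = defaultdict(list)
    for image in range(num_images):
        groups[find(image)].append(image)

    # Sort for a deterministic result.
    return sorted(groups.values())


if __name__ == "__main__":
    # Images 0-2 form one scene, 4-5 another, and 3 matches nothing.
    print(group_images(6, [(0, 1), (1, 2), (4, 5)]))
```

Each returned group would correspond to one candidate panorama or scene, and an image that matches nothing ends up in a group of its own. Because the grouping depends only on which pairs were verified, the order of the input images does not matter.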