Appearance modeling under geometric context for object recognition in videos

Object recognition is an important high-level task in surveillance applications. This dissertation focuses on building appearance models for object recognition and on exploring the relationship between shape and appearance for two key types of objects: humans and vehicles. It proposes a generic framework that models appearance while incorporating geometric prior information, the so-called geometric context. Under this framework, specialized methods are developed for recognizing humans and vehicles in surveillance videos based on their appearance and shape attributes.

The first part of the dissertation presents a unified framework based on a general definition of the geometric transform (GeT), which is applied to modeling object appearance under geometric context. The GeT models appearance by applying designed functionals over certain geometric sets. It unifies the Radon transform, the trace transform, and image warping, among others. Moreover, five novel types of GeT are introduced and applied to fingerprinting the appearance inside a contour: GeT based on level sets, GeT based on shape matching, GeT based on feature curves, GeT invariant to occlusion, and a multi-resolution GeT (MRGeT) that combines shape and appearance information.

The second part focuses on how to use the GeT to build appearance models for objects such as walking humans, whose body parts undergo articulated motion. This part also illustrates the application of the GeT to object recognition, image segmentation, video retrieval, and image synthesis. The proposed approach produces promising results when applied to automatic body part segmentation and to fingerprinting the appearance of a human and of individual body parts, despite non-rigid deformations and articulated motion.

Understanding the 3D structure of vehicles is important for recognizing them. To reconstruct the 3D model of a vehicle, the third part presents a factorization method for structure from planar motion (SfPM). Experimental results show that the algorithm is accurate and fairly robust to noise and inaccurate calibration. The differences and the dual relationship between planar motion and planar objects are also clarified. Based on this method, a fully automated vehicle reconstruction system has been designed.
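
The idea from the first part, applying a functional over a family of geometric sets, can be illustrated with a small sketch. It is not the dissertation's implementation: the names `geometric_transform` and `line_sets` are hypothetical, and the nearest-pixel binning of lines is a simplifying assumption. With straight lines as the sets and a sum as the functional, the result is a coarse discrete Radon transform; swapping in another functional (here `np.max`) gives a trace-transform-style variant.

```python
import numpy as np

def geometric_transform(image, geometric_sets, functional=np.sum):
    """Apply `functional` to the pixel values lying on each geometric set.

    geometric_sets: iterable of (rows, cols) integer index arrays.
    Empty sets map to 0.0 so that functionals such as np.max do not fail.
    """
    out = []
    for rows, cols in geometric_sets:
        out.append(functional(image[rows, cols]) if rows.size else 0.0)
    return np.array(out, dtype=float)

def line_sets(shape, angles, n_offsets):
    """Hypothetical helper: partition the pixel grid into discrete lines.

    For each angle t, pixels are binned by rho = x*cos(t) + y*sin(t),
    measured from the image centre (nearest-bin assignment, an assumption).
    """
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    xc = xs - (w - 1) / 2.0
    yc = ys - (h - 1) / 2.0
    rho_max = np.hypot(h, w) / 2.0  # strictly larger than any pixel's |rho|
    edges = np.linspace(-rho_max, rho_max, n_offsets + 1)
    sets = []
    for t in angles:
        rho = xc * np.cos(t) + yc * np.sin(t)
        bins = np.digitize(rho, edges) - 1
        for k in range(n_offsets):
            sets.append(np.nonzero(bins == k))
    return sets

# Toy image: a bright rectangle on a dark background.
image = np.zeros((8, 8))
image[2:6, 3:5] = 1.0

angles = np.linspace(0.0, np.pi, 4, endpoint=False)
sets = line_sets(image.shape, angles, n_offsets=11)

# Sum functional over lines: a coarse Radon-style sinogram (angles x offsets).
sinogram = geometric_transform(image, sets, np.sum).reshape(len(angles), 11)

# A different functional over the same sets: trace-transform-style variant.
max_trace = geometric_transform(image, sets, np.max).reshape(len(angles), 11)
```

Separating the family of sets from the functional is the design choice that matters here: the same `geometric_transform` call would accept other set families (for example level sets of a contour) without modification, which is the sense in which the abstract describes a unifying definition.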
