MOTION DETECTION BY CLASSIFICATION OF LOCAL STRUCTURES IN AIRBORNE THERMAL VIDEOS

In this paper we present a method to efficiently detect moving objects, namely vehicles, in airborne thermal videos. The motion of the sensor is estimated from the optical flow, using planar projective homographies as the transformation model. A three-level classification process is proposed. On the first level, interest locations are extracted by applying the Förstner operator. These are subject to the second, finer level of classification, which distinguishes four classes: (1) vehicle cues; (2) L-junctions and other proper fixed structures; (3) T-junctions and other risky fixed structures; (4) a rejection class containing all other locations. This classification is based on local features in the single images. Only structures from the L-junction class are traced as correspondences through subsequent frames. Based on these correspondences, the global optical flow caused by the platform movement is estimated. The flow is restricted to planar projective homographies, which greatly reduces the computation time. This opens the way for the third level of classification, in which the vehicle class is refined using motion as a feature. Inconsistency with the estimated flow is strong evidence for movement in the scene. It is detected by computing a difference image between two consecutive frames, one of which is transformed by the homography so that both appear to be taken from the same position. The difference images are pre-processed using vehicle properties and velocity.
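
The motion-compensated difference image described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain direct linear transform (DLT) for the homography, nearest-neighbour warping, and synthetic frames with a pure translation as sensor motion. The function names `estimate_homography`, `warp_nearest` and `motion_difference`, the threshold and all test data are hypothetical. The classification of interest locations and any robust outlier handling are omitted; the correspondences are simply given.

```python
import numpy as np


def estimate_homography(src, dst):
    """DLT estimate of H with dst ~ H @ src, from at least 4 point pairs (N x 2)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]


def warp_nearest(image, h):
    """Warp `image` by H via inverse mapping with nearest-neighbour sampling."""
    height, width = image.shape
    ys, xs = np.mgrid[0:height, 0:width]
    target = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    source = np.linalg.inv(h) @ target
    sx = np.rint(source[0] / source[2]).astype(int)
    sy = np.rint(source[1] / source[2]).astype(int)
    valid = (sx >= 0) & (sx < width) & (sy >= 0) & (sy < height)
    out = np.zeros(height * width)
    out[valid] = image[sy[valid], sx[valid]]
    return out.reshape(height, width), valid.reshape(height, width)


def motion_difference(prev, curr, h):
    """Absolute difference between `curr` and `prev` warped into its geometry."""
    warped, valid = warp_nearest(prev, h)
    diff = np.abs(curr.astype(float) - warped)
    diff[~valid] = 0.0  # ignore pixels without a counterpart in `prev`
    return diff


# Synthetic data (assumption): a static textured scene seen through a window
# that shifts by 3 columns and 2 rows between the two frames.
rng = np.random.default_rng(0)
scene = rng.random((60, 80))
prev = scene[0:40, 0:60].copy()
curr = scene[2:42, 3:63].copy()

# A warm 3 x 4 "vehicle" that moves 4 columns within the scene between frames.
prev[10:13, 20:24] = 5.0
curr[8:11, 21:25] = 5.0

# Correspondences on fixed structure (stand-ins for tracked L-junctions).
src = np.array([[5, 5], [50, 6], [48, 30], [7, 33], [25, 18]], dtype=float)
dst = src + np.array([-3.0, -2.0])

h = estimate_homography(src, dst)
diff = motion_difference(prev, curr, h)
moving = diff > 1.0  # hypothetical threshold
```

In this toy setting the background cancels after warping, and `moving` marks only the old and new positions of the vehicle. With real thermal imagery the residual would also contain noise and parallax from non-planar structure, which is where pre-processing with vehicle properties would come in.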
