Advances in Model-Based Traffic Vision

Model-based vision allows prior knowledge of the shape and appearance of specific objects to be used in the interpretation of a visual scene; it provides a powerful and natural way to enforce the view consistency constraint [1]. A model-based vision system has been developed within ESPRIT VIEWS: P2152 which is able to classify and track moving objects (cars and other vehicles) in complex, cluttered traffic scenes. The fundamental basis of the method has been previously reported [2]. This paper presents recent developments which have extended the scope of the system to include (i) multiple cameras, (ii) variable camera geometry, and (iii) articulated objects. All three enhancements have been accommodated easily within the original model-based approach.

1 Review of methods

The models used consist of 3D geometrical representations of known objects (vehicles) together with calibrated camera and scene models [3]. Using the known camera and scene geometry, and given a provisional position and orientation (derived from data-driven detection of temporal change in the image), a 3D object model can be instantiated into the 2D image plane and a "goodness-of-fit" score obtained by comparing the modelled features with the image. An iterative search in position-space and orientation-space is then used to maximize this evaluation score. At each step in the search the model is re-instantiated into the scene and a new goodness-of-fit score is evaluated.
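
The iterative pose search described above can be illustrated with a minimal sketch. The text does not say which search strategy the system uses, so the coordinate-wise hill climb with step halving below is only an assumed stand-in. The names refine_pose and toy_score, the (x, y, theta) ground-plane pose parameterisation, and all step sizes are hypothetical. In a real system the score callable would re-instantiate the 3D model into the image at the candidate pose and compare the modelled features with the image; here a smooth synthetic function with a known peak takes its place.

```python
import math
from typing import Callable, Tuple

# Assumed pose parameterisation: position (x, y) and orientation theta.
Pose = Tuple[float, float, float]


def refine_pose(score: Callable[[Pose], float], start: Pose,
                step_xy: float = 1.0,
                step_theta: float = math.radians(10.0),
                min_step_xy: float = 0.01,
                max_iters: int = 200) -> Tuple[Pose, float]:
    """Maximise score(pose) by coordinate-wise hill climbing.

    Each candidate pose is scored afresh, mirroring the idea that the
    model is re-instantiated and re-evaluated at every search step.
    """
    pose = start
    best = score(pose)
    for _ in range(max_iters):
        improved = False
        for axis, step in ((0, step_xy), (1, step_xy), (2, step_theta)):
            for sign in (1.0, -1.0):
                cand = list(pose)
                cand[axis] += sign * step
                cand_pose = (cand[0], cand[1], cand[2])
                s = score(cand_pose)
                if s > best:
                    pose, best, improved = cand_pose, s, True
        if not improved:
            # No neighbour is better: refine the search resolution.
            step_xy /= 2.0
            step_theta /= 2.0
            if step_xy < min_step_xy:
                break
    return pose, best


def toy_score(pose: Pose) -> float:
    """Hypothetical stand-in for the image-based goodness-of-fit score."""
    x, y, theta = pose
    return -((x - 3.0) ** 2 + (y + 1.5) ** 2 + (theta - 0.4) ** 2)


# Start from a provisional pose, as would be supplied by change detection.
best_pose, best_score = refine_pose(toy_score, (0.0, 0.0, 0.0))
```

A derivative-free search is used in this sketch because it needs only score evaluations, not gradients; whether the original system works in the same way is not stated in the text.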