Bayesian fusion of thermal and visible spectra camera data for mean shift tracking with rapid background adaptation

This paper presents a method for optimally combining pixel information from thermal imaging and visible-spectrum colour cameras in order to track an arbitrarily shaped, deformable moving target. The tracking algorithm rapidly re-learns its background models for each camera modality from scratch at every frame. This enables, firstly, automatic adjustment of the relative importance of thermal and visible information in decision making, and, secondly, a degree of “camouflage target” tracking by continuously re-weighting the importance of those parts of the target model that are most distinct from the present background at each frame. This very rapid background adaptation also provides robustness to rapid camera motion. The combination of thermal and visible information is applicable to any target, but is particularly useful for people tracking. The method additionally extends readily to the fusion of data from arbitrarily many imaging modalities.
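To make the core idea concrete, the following is a minimal sketch, not the authors' implementation, of how per-frame background re-learning and Bayesian fusion of two modalities might feed a mean shift weight computation. The histogram quantisation, the epsilon constant, and the naive-Bayes-style product fusion of likelihood ratios are illustrative assumptions rather than details taken from the paper.

```python
# Sketch: per-frame foreground/background histograms for each modality,
# background-discriminative weighting, and fusion of thermal + visible
# likelihood ratios into per-pixel mean shift weights (illustrative only).
import numpy as np

EPS = 1e-6  # assumed small constant to avoid division by zero

def histogram(values, n_bins):
    """Normalised histogram of quantised pixel values (values in [0, n_bins))."""
    h = np.bincount(values.ravel(), minlength=n_bins).astype(float)
    return h / (h.sum() + EPS)

def fused_weights(fg_vis, bg_vis, fg_th, bg_th, roi_vis, roi_th):
    """Per-pixel weights for the current search window.

    fg_*/bg_* are the foreground (target) and background histograms
    re-learned at this frame for the visible and thermal modalities;
    roi_* are the quantised pixel values inside the search window.
    """
    # Likelihood ratio per bin: bins distinct from the current background
    # receive larger weights, which is the re-weighting idea in the abstract.
    ratio_vis = fg_vis / (bg_vis + EPS)
    ratio_th = fg_th / (bg_th + EPS)
    # Assumed fusion rule: treat modalities as independent and multiply
    # their per-pixel likelihood ratios (naive-Bayes-style combination).
    w = ratio_vis[roi_vis] * ratio_th[roi_th]
    return w / (w.max() + EPS)

# Toy usage with random 8-bin quantised patches standing in for real frames.
rng = np.random.default_rng(0)
target_vis, back_vis = rng.integers(0, 8, (20, 20)), rng.integers(0, 8, (60, 60))
target_th, back_th = rng.integers(0, 8, (20, 20)), rng.integers(0, 8, (60, 60))
w = fused_weights(histogram(target_vis, 8), histogram(back_vis, 8),
                  histogram(target_th, 8), histogram(back_th, 8),
                  target_vis, target_th)
print(w.shape)  # per-pixel weights that would drive the mean shift iteration
```

Because both background histograms are rebuilt from scratch at every frame, a modality whose target appearance becomes indistinguishable from the background contributes ratios near one and is automatically down-weighted in the fused result.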