Evidential reasoning framework for object tracking

Object tracking consists of reconstructing the configuration of an articulated body from a sequence of images provided by one or more cameras. In this paper we present a general method for pose estimation based on evidential reasoning. The proposed framework integrates several levels of description of the object to improve robustness and precision, overcoming the limitations of approaches that rely on a single-feature representation. Several image descriptions extracted from a single camera view are fused using the Dempster-Shafer theory of evidence. Feature data are expressed as belief functions over the set of their possible values. No a-priori assumptions about the object model are required: learned refinement maps between the feature spaces and the parameter space Q describing the configuration of the object encode the relationships among the distinct representations of the pose and play the role of the model. During training, the object follows a sample trajectory in Q. Each feature space is reduced to a discrete frame of discernment (FOD), and refinements are built by mapping these FODs onto subsets of the sample trajectory. During tracking, new sensor data are converted into belief functions, which are projected onto the approximate state space and combined there. The resulting degrees of belief indicate the best pose estimate at the current time step. The choice of a sufficiently dense (in a topological sense) sample trajectory is a critical issue. Experimental results for a simple tracking system are presented.
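
As a concrete illustration of the fusion step, the sketch below implements Dempster's rule of combination for mass functions over a finite frame of discernment, with focal elements encoded as frozensets. The dictionary representation and the small two-feature demo are illustrative assumptions, not the paper's implementation.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions on the same frame of discernment (FOD)
    via Dempster's rule: intersect focal elements, multiply their masses,
    and renormalise by the non-conflicting mass."""
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass falling on the empty set
    if conflict >= 1.0:
        raise ValueError("totally conflicting evidence; combination undefined")
    return {s: w / (1.0 - conflict) for s, w in combined.items()}

# Toy demo on a three-element FOD {a, b, c}: two belief functions that
# both lend partial support to 'b' reinforce each other when combined.
m1 = {frozenset("ab"): 0.7, frozenset("abc"): 0.3}
m2 = {frozenset("bc"): 0.6, frozenset("abc"): 0.4}
print(dempster_combine(m1, m2))
# {frozenset({'b'}): 0.42, frozenset({'a','b'}): 0.28,
#  frozenset({'b','c'}): 0.18, frozenset({'a','b','c'}): 0.12}
```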
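The projection step can be sketched in the same spirit. Below, a learned refinement map (from elements of a feature FOD to subsets of the sample trajectory in Q) carries a feature-space mass function into the approximate state space, and a pose estimate is read off as the sample point of maximal plausibility. Both the dictionary encoding of the refinement and the max-plausibility decision rule are assumptions made here for illustration; the abstract does not specify them.

```python
def project(mass_feature, refinement):
    """Carry a mass function from a feature FOD into the approximate state
    space: each focal element maps to the union of the refinements of its
    elements, keeping its mass (a vacuous extension through the map)."""
    projected = {}
    for focal, w in mass_feature.items():
        image = frozenset().union(*(refinement[e] for e in focal))
        projected[image] = projected.get(image, 0.0) + w
    return projected

def best_pose(mass_state, poses):
    """Pick the sample pose of maximal plausibility, i.e. the total mass of
    focal elements containing it -- one plausible reading of 'the resulting
    degrees of belief indicate the best pose estimate'."""
    return max(poses, key=lambda q: sum(w for f, w in mass_state.items() if q in f))

# Toy demo: a feature FOD {f0, f1} refined onto three sample poses in Q.
refinement = {"f0": {"q0", "q1"}, "f1": {"q2"}}
m_feature = {frozenset({"f0"}): 0.8, frozenset({"f0", "f1"}): 0.2}
m_state = project(m_feature, refinement)
print(best_pose(m_state, ["q0", "q1", "q2"]))
# -> "q0" (q0 and q1 tie at plausibility 1.0; q2 trails at 0.2)
```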